v2.0.0-beta.1
Release date: 2023-12-04 23:20:53
Latest deepset-ai/haystack release: v2.4.0 (2024-08-15 17:39:00)
Introduction
We are happy to officially share Haystack 2.0-beta with you. The new version is a complete rework of the pipeline, our core concept, with production readiness, ease of use, and customizability in mind.
See the Haystack 2.0-Beta documentation, check the features available in this Beta release (listed in the section below), and try out Haystack 2.0-Beta in “Advent of Haystack”.
What does the “Beta” mean for me?
Production readiness also means caring about stability. Therefore, we decided to release a beta version now and test it thoroughly in public over the coming weeks. We will add more features and may introduce breaking changes until the stable 2.0 release in late Q1 2024.
We invite you to try this beta version and give candid feedback. It will be heard, and we will change Haystack accordingly. We’ve put together 10 code challenges for you in our “Advent of Haystack” so you can get hands-on with it. We don’t yet recommend migrating your production pipelines to 2.0 beta.
We will continue to support Haystack 1.x, adding updates and important features to the codebase even after the final 2.0.0 release, to give users time to migrate.
⭐️ What’s changed?
For a detailed overview of what’s changed in this Beta release, check out our article “Introducing Haystack 2.0 and Advent of Haystack”.
The bulk of the work in this release introduces changes to the fundamental design of:
In the last few months, we've been working with our community members and partners to start adding integrations for Haystack 2.0. Today, along with the beta package, you can also try the integrations tagged with Haystack 2.0 in our Integration inventory!
🚀 Getting started
One way to get started with Haystack 2.0 Beta is to participate in the “Advent of Haystack” and give us feedback on how you got along.
To install the new package:
pip install haystack-ai
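Haystack 1.x is published on PyPI as farm-haystack, and both distributions provide the `haystack` import package, so it can help to confirm which one is present in the current environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of Haystack.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# haystack-ai is the 2.0 beta package; farm-haystack is the 1.x package.
for name in ("haystack-ai", "farm-haystack"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```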
To use a simple RAG pipeline:
from haystack import Document
from haystack.document_stores import InMemoryDocumentStore
from haystack.pipeline_utils import build_rag_pipeline
API_KEY = "sk-xxx" # ADD YOUR OPENAI API KEY
# We support many different databases. Here we load a simple and lightweight in-memory document store.
document_store = InMemoryDocumentStore()
# Create some example documents and add them to the document store.
documents = [
    Document(content="My name is Jean and I live in Paris."),
    Document(content="My name is Mark and I live in Berlin."),
    Document(content="My name is Giorgio and I live in Rome."),
]
document_store.write_documents(documents)
# Let's now build a simple RAG pipeline that uses a generative model to answer questions.
rag_pipeline = build_rag_pipeline(llm_api_key=API_KEY, document_store=document_store)
answers = rag_pipeline.run(query="Who lives in Rome?")
print(answers.data)
For more details on how to get started see: https://docs.haystack.deepset.ai/v2.0/docs/get_started
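To make the idea behind the quick-start above more concrete, here is a conceptual sketch in plain Python of the two steps a simple RAG pipeline performs before calling a generative model: retrieving the most relevant document and building a prompt from it. This is an illustration under simplifying assumptions, not Haystack's implementation: the functions `tokenize`, `retrieve`, and `build_prompt` are hypothetical, the retrieval is a toy keyword overlap rather than BM25 or embeddings, and the call to the language model is left out.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: list, top_k: int = 1) -> list:
    """Rank documents by the number of tokens they share with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: list) -> str:
    """Combine the retrieved documents and the question into a single prompt."""
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer the question using only the context.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
        "Answer:"
    )


documents = [
    "My name is Jean and I live in Paris.",
    "My name is Mark and I live in Berlin.",
    "My name is Giorgio and I live in Rome.",
]
query = "Who lives in Rome?"

top_docs = retrieve(query, documents, top_k=1)
prompt = build_prompt(query, top_docs)
print(prompt)  # this text is what would be sent to the generative model
```

In a real pipeline the retrieval step would be handled by a retriever component backed by a document store, and the prompt would be passed to a generator.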
🪶 List of Features
✅ Ready in this Beta release
🏗️ Under construction
Feature | Haystack 2.0-Beta |
---|---|
Document Stores | |
InMemoryDocumentStore | ✅ |
ElasticsearchDocumentStore | ✅ |
OpenSearchDocumentStore | ✅ |
ChromaDocumentStore | ✅ |
MarqoDocumentStore | ✅ |
FAISSDocumentStore | 🏗️ |
PineconeDocumentStore | 🏗️ |
WeaviateDocumentStore | 🏗️ |
MilvusDocumentStore | 🏗️ |
QdrantDocumentStore | 🏗️ |
PGVectorDocumentStore | 🏗️ |
MongoDBAtlasDocumentStore | 🏗️ |
Generators | |
GPTGenerator | ✅ |
HuggingFaceLocalGenerator | ✅ |
HuggingFaceTGIGenerator | ✅ |
GradientGenerator | ✅ |
Anthropic - Claude | 🏗️ |
Cohere - generate | ✅ |
AzureGPT | 🏗️ |
AWS Bedrock | 🏗️ |
AWS SageMaker | 🏗️ |
PromptNode | 🏗️ |
PromptBuilder | ✅ |
AnswerBuilder | ✅ |
Embedders | |
OpenAI Embedder | ✅ |
SentenceTransformers Embedder | ✅ |
Cohere - embed | 🏗️ |
Gradient Embedder (external) | ✅ |
Retrievers | |
InMemoryBM25Retriever | ✅ |
InMemoryEmbeddingRetriever | ✅ |
ElasticsearchBM25Retriever | ✅ |
ElasticsearchEmbeddingRetriever | ✅ |
OpenSearchBM25Retriever | ✅ |
OpenSearchEmbeddingRetriever | ✅ |
SerperDevWebSearch | ✅ |
MultiModalRetriever | 🏗️ |
TableTextRetriever | 🏗️ |
DensePassageRetriever | 🏗️ |
Rankers | |
TransformersSimilarityRanker | ✅ |
CohereRanker | 🏗️ |
DiversityRanker | 🏗️ |
LostInTheMiddleRanker | 🏗️ |
RecentnessRanker | 🏗️ |
MetaFieldRanker | ✅ |
Readers | |
ExtractiveReader (successor of both FARMReader and TransformersReader) | ✅ |
TableReader | 🏗️ |
Data Processing | |
Local + Remote WhisperTranscriber | ✅ |
UrlCacheChecker | ✅ |
LinkContentFetcher | ✅ |
AzureOCRDocumentConverter | ✅ |
HTMLToDocument | ✅ |
PyPDFToDocument | ✅ |
TikaDocumentConverter | ✅ |
TextFileToDocument | ✅ |
MarkdownToDocument | ✅ |
DocumentCleaner | ✅ |
TextDocumentSplitter | ✅ |
TextLanguageClassifier | ✅ |
FileTypeRouter | ✅ |
MetadataRouter | ✅ |
DocumentWriter | ✅ |
DocumentJoiner | ✅ |
Misc | |
Evaluation | 🏗️ |
Agents | 🏗️ |
Conversational Agent | 🏗️ |
TopPSampler | ✅ |
TransformersSummarizer | 🏗️ |
TransformersTranslator | 🏗️ |