LangChain First Impressions: Powerful but Messy

Everyone building AI applications seems to be using LangChain. It's the most popular framework for building LLM-powered apps, with a massive community and a new integration added seemingly every day. I spent two weeks building a RAG (Retrieval-Augmented Generation) system with it. My feelings are mixed.

What I Was Building

A simple concept: take a collection of technical documents, embed them into a vector database, and then let users ask questions that get answered using relevant context from those documents. This is the classic RAG pattern and it's exactly what LangChain is designed for.

The Good

LangChain gets you to a working prototype incredibly fast. In about 50 lines of Python, I had documents being loaded, chunked, embedded, stored in Chroma (an embeddable vector database that I ran in memory), and queried. The "getting started" experience is genuinely impressive.

The integrations are the real value. Need to load PDFs? There's a loader. Notion pages? There's a loader. GitHub repos? Loader. Confluence? Loader. Instead of writing custom parsers for each document format, LangChain has pre-built components for almost everything.

The chain abstraction is also powerful once you get it. You compose operations: retrieve context, format prompt, call LLM, parse output. Each step is a link in the chain. It's a clean mental model for building LLM pipelines.
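To make that mental model concrete, here is a minimal sketch of a chain as plain function composition. This is not LangChain's API: `make_chain`, `retrieve`, `format_prompt`, `fake_llm`, and `parse_output` are hypothetical names I made up for illustration, and the "LLM" is a stand-in so the example runs offline.

```python
from functools import reduce
from typing import Any, Callable

Step = Callable[[Any], Any]


def make_chain(*steps: Step) -> Step:
    """Compose steps left to right: each step's output feeds the next."""
    def run(value: Any) -> Any:
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


# Toy document set (assumption, for illustration only).
DOCS = {
    "chroma": "Chroma is a vector database.",
    "rag": "RAG retrieves context before generating an answer.",
}


def retrieve(question: str) -> dict:
    # Toy retrieval: keep documents whose key appears in the question.
    hits = [text for key, text in DOCS.items() if key in question.lower()]
    return {"question": question, "context": hits}


def format_prompt(state: dict) -> str:
    context = "\n".join(state["context"])
    return f"Context:\n{context}\n\nQuestion: {state['question']}"


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call: echoes the first context line.
    return "  ANSWER: " + prompt.splitlines()[1] + "  "


def parse_output(raw: str) -> str:
    # Trim whitespace and drop the answer marker (needs Python 3.9+).
    return raw.strip().removeprefix("ANSWER: ")


rag_chain = make_chain(retrieve, format_prompt, fake_llm, parse_output)
answer = rag_chain("What is RAG?")
print(answer)
```

Each link is an ordinary function, so any one of them can be called and inspected on its own, which is roughly what a framework's chain object does with more machinery around it.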

The Bad

The abstraction layers run deep. Really deep. When something goes wrong, and it will, debugging means digging through five or six layers of abstraction to find out what's actually happening. A simple "call the OpenAI API" operation goes through a chain, which calls a language model wrapper, which calls an API wrapper, which finally makes the HTTP request. When you get a confusing error, good luck tracing it back to the actual cause.

The documentation is chaotic. LangChain moves fast and the docs lag behind. I found multiple examples in the docs that didn't work with the current version. Import paths change between releases. Class names get renamed. Deprecated methods linger in tutorials. I spent more time debugging import errors and deprecated APIs than I spent on my actual application logic.

The Python library is also just big. My installed dependency list exploded. LangChain pulls in dozens of transitive dependencies, many of which I don't need. The JavaScript/TypeScript version is somewhat better in this regard but still heavier than I'd like.

The Ugly

Here's my main complaint. For many use cases, LangChain adds complexity without proportional value. My RAG system, stripped to its core, does three things: embed documents, find similar documents, and call the OpenAI API with context. I can write that in about 100 lines of plain Python with the OpenAI SDK and a vector database client. No framework needed.
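As a rough sketch of those three steps without a framework, something like the following is what I have in mind. The embedding and LLM calls are passed in as plain functions so the example runs offline; in a real system they would wrap the OpenAI SDK and a vector database client. `TinyRAG`, `toy_embed`, and `cosine` are hypothetical names, and `toy_embed` is a bag-of-words stand-in, not a real embedding model.

```python
import math
from collections import Counter
from typing import Callable

Vector = dict[str, float]


def toy_embed(text: str) -> Vector:
    """Bag-of-words stand-in for a real embedding model (assumption)."""
    words = [w.strip(".,?!").lower() for w in text.split()]
    return dict(Counter(w for w in words if w))


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyRAG:
    def __init__(self, embed: Callable[[str], Vector], llm: Callable[[str], str]):
        self.embed = embed
        self.llm = llm
        self.index: list[tuple[Vector, str]] = []

    def add(self, docs: list[str]) -> None:
        # Step 1: embed documents and keep them alongside their text.
        self.index.extend((self.embed(d), d) for d in docs)

    def search(self, query: str, k: int = 2) -> list[str]:
        # Step 2: rank stored documents by similarity to the query.
        q = self.embed(query)
        ranked = sorted(self.index, key=lambda row: cosine(q, row[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]

    def ask(self, question: str, k: int = 2) -> str:
        # Step 3: call the LLM with the retrieved context in the prompt.
        context = "\n".join(self.search(question, k))
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        return self.llm(prompt)


# An echo "LLM" returns the prompt itself, which makes the context visible.
rag = TinyRAG(embed=toy_embed, llm=lambda prompt: prompt)
rag.add([
    "Chroma stores embeddings for similarity search.",
    "Bananas are rich in potassium.",
])
top = rag.search("Which database stores embeddings?", k=1)
reply = rag.ask("Which database stores embeddings?", k=1)
```

Swapping `toy_embed` and the echo lambda for real API calls changes two constructor arguments and nothing else, and every step stays visible in a debugger.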

With LangChain, my code was "simpler" in some ways (fewer lines of visible code) but much more complex in practice. The abstractions hide what's happening, which is fine when everything works and terrible when it doesn't. I spent an afternoon debugging why my retriever wasn't returning the right results, only to discover that a default parameter in a LangChain class was chunking my documents differently than I expected.
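A small sketch shows how much a chunking default can change what the retriever sees. `chunk_text` is a hypothetical fixed-size character splitter, not LangChain's splitter, and the default values here are illustrative rather than LangChain's actual defaults.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


doc = "word " * 600  # 3000 characters of filler text

# Relying on the defaults: large, overlapping chunks.
default_chunks = chunk_text(doc)

# Stating the parameters explicitly: small, non-overlapping chunks.
explicit_chunks = chunk_text(doc, chunk_size=300, overlap=0)

print(len(default_chunks), len(explicit_chunks))
```

The same document becomes 4 chunks in one case and 10 in the other, so similarity search runs against very different units of text. Passing chunking parameters explicitly, whatever library does the splitting, would have saved me that afternoon.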

The framework also encourages a particular way of thinking about LLM applications that isn't always the best way. Not everything needs to be a chain. Not every interaction needs an agent with tools. Sometimes a direct API call with a well-crafted prompt is all you need.

When to Use LangChain

I think LangChain makes sense when you need multiple integrations (loading from various sources, using multiple LLMs, connecting to different vector stores) and want to swap components easily. If you're building something that needs to work with Pinecone today and Weaviate tomorrow, the abstraction layer pays for itself.
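The swap-ability argument comes down to coding against a narrow interface. Here is a minimal sketch of that idea, assuming a hypothetical `VectorStore` protocol of my own; the two toy backends stand in for real adapters, and none of this reflects the actual Pinecone, Weaviate, or LangChain APIs.

```python
import math
from typing import Protocol


class VectorStore(Protocol):
    """The only surface the application code depends on (hypothetical)."""

    def add(self, vector: list[float], text: str) -> None: ...
    def query(self, vector: list[float], k: int) -> list[str]: ...


class DotProductStore:
    """Toy backend A: ranks by dot product, highest first."""

    def __init__(self) -> None:
        self._rows: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], text: str) -> None:
        self._rows.append((vector, text))

    def query(self, vector: list[float], k: int) -> list[str]:
        def score(row: tuple[list[float], str]) -> float:
            return sum(a * b for a, b in zip(vector, row[0]))
        ranked = sorted(self._rows, key=score, reverse=True)
        return [text for _, text in ranked[:k]]


class EuclideanStore:
    """Toy backend B: ranks by Euclidean distance, nearest first."""

    def __init__(self) -> None:
        self._rows: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], text: str) -> None:
        self._rows.append((vector, text))

    def query(self, vector: list[float], k: int) -> list[str]:
        ranked = sorted(self._rows, key=lambda row: math.dist(vector, row[0]))
        return [text for _, text in ranked[:k]]


def build_context(store: VectorStore, query_vector: list[float], k: int = 1) -> str:
    # Application code: knows the protocol, not the backend.
    return "\n".join(store.query(query_vector, k))


def load(store: VectorStore) -> VectorStore:
    store.add([1.0, 0.0], "doc about loaders")
    store.add([0.0, 1.0], "doc about agents")
    return store


contexts = [
    build_context(load(backend), [0.9, 0.1])
    for backend in (DotProductStore(), EuclideanStore())
]
```

`build_context` never changes when the backend does. With only one backend in play, though, the protocol is an extra layer with nothing to abstract over.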

It also makes sense for prototyping. Getting a working demo in front of stakeholders fast has real value, and LangChain's speed to first result is hard to beat.

When to Skip It

If your use case is straightforward (one LLM, one vector store, one document source), just write it directly. Call the APIs yourself. You'll write a few more lines by hand, but you'll have fewer dependencies, easier debugging, and a better understanding of what your system actually does.

I ended up rewriting my RAG system without LangChain. It took a day. The code is longer but I understand every line of it, debugging is straightforward, and it does exactly what I need without any framework opinions getting in the way.

LangChain is a powerful tool. But powerful tools aren't always the right tools. Know what you're building, assess whether the abstraction helps or hinders, and don't use a framework just because everyone else is.