Using Claude Code for Information Architecture

Out of the box, LLMs are most useful at granular tasks: summarizing a paper, drafting a post, outlining a story, translating a text, etc. Broader tasks — summarizing a whole book, for example — can prove challenging. The more text there is, the harder it is for LLMs to see the forest for the trees. But the forest is what we care about in information architecture.

There are techniques for working around this limitation. I’ve written about my experiments with using RAG with knowledge graphs. That approach showed promise, but has downsides. For one, it’s expensive (in terms of cost and energy). It’s also time-consuming. But more importantly, it has produced mixed results for IA use cases.

There are two primary tasks I want help with:

understanding large unstructured information sets during early project stages, and
producing draft taxonomies to organize large unstructured information sets.

In theory, graph RAG should help with both tasks. In practice, it’s been more useful for the first than the second. Whenever I prompt a graph RAG instance for suggestions on how to organize a corpus, I get back echoes of the graph’s structure. That’s not what I want.

Recently, I experimented with another approach that shows promise. It entails using a tool that isn’t designed for this purpose: Claude Code.

What is Claude Code?

Claude Code is a tool built by Anthropic to assist software developers. It uses Anthropic’s LLMs to create and modify software code on your behalf. You can think of it as an “agentic” coding assistant — that is, one that can do things to and with your code directly.

To do this, you give the tool access to a code repository on your computer. You interact with it via the command line. When you first call the agent on a folder containing software code, it ingests and indexes the code. It develops an internal representation of your codebase’s architecture, allowing it to work on the codebase as a whole — the forest view.

Claude Code provides a prompt-driven user interface. That is, you chat with it like you do with ChatGPT or the web-based version of Claude. But there are a few differences. For one, your codebase (and a representation of its structure) is pulled into the conversation context. With your permission, Claude Code can also modify the code. For example, it can fix bugs and run tests to check whether the fixes worked.

These aren’t things a regular chatbot can do. While you can tell the web version of Claude about your code by creating a project and uploading files or pointing it to a repo, regular Claude can only read those files into its context. In contrast, Claude Code can also modify the files. More importantly for our use case, Claude Code also structures internal representations of the codebase in ways optimized for development.

Use Case: Generating Draft Taxonomies

I hope you see where this is going. Code is just text, right? What if instead of (say) Python, we fed Claude Code the content of a website?

I tried it recently with jarango.com, and the results are promising. This site’s content is managed using Jekyll, a static site generator. Content is stored as plain text files (using Markdown markup) and processed into HTML at build time, which in my case happens either manually (for testing) or when I commit changes to a GitHub repository (for production).

That’s a major advantage in this scenario: I can call up Claude Code in the folder on my computer where the site’s content lives and have it treat the Markdown files as “software code.” After the initial ingestion, I can prompt Claude Code on the entire corpus, much like I can using graph RAG — but with certain superpowers.

(If you’re using a dynamic CMS such as WordPress or Drupal, you’ll have to first export the site’s content as plain text files. There are Unix utilities that can assist with that process. I’ll leave it up to you to figure out how — but note LLMs are useful in this scenario as well: they know a lot about shell scripting!)
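As one illustration of that export step, here is a minimal Python sketch that turns a folder of saved HTML pages into plain text files. It assumes the CMS content has already been saved as .html files; the function names and folder layout are hypothetical, and it uses only the standard library's html.parser. Real exports usually need extra cleanup (navigation, footers, comments), so treat it as a starting point rather than a finished converter.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, one text chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def convert_folder(src, dest):
    """Write one .txt file under dest for each .html file found under src.

    Both folder names are hypothetical; the relative layout is preserved.
    """
    src, dest = Path(src), Path(dest)
    written = 0
    for page in sorted(src.rglob("*.html")):
        out = dest / page.relative_to(src).with_suffix(".txt")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(html_to_text(page.read_text(encoding="utf-8")), encoding="utf-8")
        written += 1
    return written
```

Calling convert_folder("export", "corpus") would mirror a hypothetical export folder into a corpus folder of text files, which is the folder you would then open with the agent.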

Back to the experiment. The initial ingestion process was very fast. I wondered whether Claude understood what it was dealing with, so my initial prompt was intentionally vague:

What is this?

Claude’s response was spot-on:

This is Jorge Arango’s personal website and blog, built with Jekyll. It’s a content-focused site featuring:

Blog posts about information architecture, UX design, and systems thinking (hundreds of posts since 2002)

“The Informed Life” podcast exploring how people organize information

Speaking and workshop information

Books section (likely featuring his published works)

Services for consulting/workshops

The site uses Jekyll as a static site generator with standard Jekyll structure (_posts, _layouts, _includes, etc.) and appears to focus on information architecture, design strategy, and digital product thinking.

This gave me enough confidence to get to the point:

Ok, good. Look through the posts in the site. I’m considering re-architecting this content to make it easier for users to find. What taxonomies could I use to describe this content?

This was Claude’s response:

A screenshot of an interaction with Claude Code that reads:

Content Taxonomy ✓ Done (30 tool uses · 63.6k tokens · 3m 21.1s)

Based on my analysis of your blog posts, here are the key taxonomies I’d recommend for reorganizing your content:

Primary Recommendation: Hybrid Content-Type + Subject Domain

Top-Level Categories:

1. Information Architecture - Core IA theory, practice, and professional development

2. Systems & Strategy - Systems thinking, business strategy, organizational design

3. Books & Learning - Book notes, reading lists, educational resources

4. TAOI - Your established “The Architecture of Information” series

5. Tools & Methods - Practical frameworks, canvases, methodologies

6. Reflections - Quick thoughts, observations, commentary

Alternative Approaches to Consider:

Audience-Centered Taxonomy:

For IA Practitioners - Professional development, advanced concepts

For Design Leaders - Strategy, management, organizational change

For Learners - Foundational concepts, reading recommendations

For Technologists - Platform analysis, future trends

Content Format Taxonomy:

Essays & Analysis - Deep explorations

Quick Thoughts - Daily insights and observations

Book Notes - Reviews and reading recommendations

Tools & Resources - Practical frameworks

Interviews & Media - Conversations and appearances

Secondary Navigation Features:

Discovery Paths:

Getting Started - Curated intro to key concepts

Popular Posts - Most-read content

Latest Thinking - Recent focus areas

Deep Dives vs Quick Insights

Filter System:

Audience level (Beginner/Intermediate/Advanced)

Content length (Quick read/Medium/Deep dive)

Practical vs Theoretical focus

The hybrid approach preserves your successful series (TAOI, Book Notes) while serving both casual browsers and focused researchers seeking specific IA knowledge.

Pretty good for a first try! Note this took a little over three minutes and close to 64,000 tokens — much faster and more efficient than graph RAG. But, critically, it’s also more useful than any result I’ve gotten from graph RAG.

Claude suggested a primary hybrid content type + subject domain taxonomy plus two alternatives: an audience-centered taxonomy and one centered on the content format. It also provided useful terms for all three and alternative navigation paths.

This is impressive, given that categorizing content from the top down isn’t the tool’s primary use case. But still, this was a fairly generic request: I asked for a way to make the content easier to find but didn’t specify who would be finding content on the site. That was the focus of my next prompt:

This is good, thank you. Recently, I’ve started pivoting away from IA Practitioners and Designers toward business leaders more broadly. Also, I’ve started writing more about artificial intelligence. (Notice my new tagline: “Architecting Intelligence” — I’m arguing IA is essential for AI.) How would you change your proposed categorization schemes to support these new directions?

Again, its response exceeded my expectations:

A screenshot of an interaction with Claude Code that reads:

AI Strategy

Given your pivot toward business leaders and focus on AI with “Architecting Intelligence,” here’s how I’d revise the taxonomy to support these strategic directions:

Revised Primary Recommendation: Business-Focused + AI-Forward

Top-Level Categories:

1. Architecting Intelligence - AI/IA intersection, intelligent systems design

2. Business Strategy - Strategic value of information architecture for leaders

3. Systems Leadership - Systems thinking for executives and decision-makers

4. Learning & Insights - Books, observations, thought leadership

5. Methods & Frameworks - Practical tools for business application

6. Reflections - Quick insights and commentary

Alternative Business-Centered Approach:

Executive-Focused Categories:

Strategic Advantage - How IA/AI creates competitive advantage

Organizational Intelligence - Building smarter, more adaptive organizations

Digital Transformation - Information architecture in business transformation

Leadership Insights - Systems thinking for decision-makers

Future of Business - AI implications and strategic preparation

Content Positioning Adjustments:

Reframe Existing Content:

TAOI → “Intelligence in the Wild” - Real-world examples of good/bad information design affecting business outcomes

IA Fundamentals → “Information Strategy” - Business-focused framing

Design Process → “Strategic Process” - Decision-making and organizational methods

New Content Streams to Highlight:

AI-Forward Sections:

AI Strategy - How leaders should think about AI implementation

Human-AI Collaboration - Designing for augmented intelligence

Intelligent Organizations - Structural changes needed for AI integration

Information Governance - Critical foundation for AI success

Business Leader Navigation:

Quick Wins - Immediate actionable insights

Strategic Frameworks - Comprehensive planning tools

Case Studies - Business transformation examples

Executive Briefings - Concise strategic overviews

This repositioning maintains your established expertise while clearly signaling the business value and AI relevance to your new target audience.

In this instance, Claude recommended organizing the site according to a “Business-Focused + AI-Forward” taxonomy that includes the following terms:

Architecting Intelligence - AI/IA intersection, intelligent systems design

Business Strategy - Strategic value of information architecture for leaders

Systems Leadership - Systems thinking for executives and decision-makers

Learning & Insights - Books, observations, thought leadership

Methods & Frameworks - Practical tools for business application

Reflections - Quick insights and commentary

Claude also provided alternative organization schemes and “content positioning” adjustments based on the strategic shift that is prompting this ostensible website redesign.

Now, would I use these new categorization schemes as-is? No, there’s a lot of tweaking to be done. I could also be much more precise with my prompting. For example, I haven’t told the LLM enough about the intended audience; it’s guessing at what I mean by “business leaders.” But these suggestions provide a solid starting point for an IA design process, and certainly a better one than I could have come up with myself in under ten minutes.

There are also likely limitations. At close to 1,300 pages, jarango.com isn’t a particularly large site. I wonder how well this approach would scale to a site ten times that size. And of course, I’m dealing exclusively with text files. How would this work in a site with lots of PDFs? I expect that would require converting them to text, a process that yields mixed results.
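One way to reason about the scaling question is to measure the corpus before pointing an agent at it. The sketch below is a rough illustration, not a real tokenizer: it walks a folder of Markdown files and reports file, word, and character counts, plus a token estimate. The four-characters-per-token ratio is an assumption (a common rule of thumb for English prose), and the function name and default pattern are hypothetical. Actual token counts depend on the model's tokenizer.

```python
from pathlib import Path

# Assumption: a rough rule of thumb for English prose, not a real tokenizer.
CHARS_PER_TOKEN = 4


def corpus_stats(root, pattern="*.md"):
    """Count files, words, and characters under root; estimate tokens."""
    files = sorted(Path(root).rglob(pattern))
    chars = 0
    words = 0
    for f in files:
        # Replace undecodable bytes so one bad file does not stop the scan.
        text = f.read_text(encoding="utf-8", errors="replace")
        chars += len(text)
        words += len(text.split())
    return {
        "files": len(files),
        "words": words,
        "chars": chars,
        "approx_tokens": chars // CHARS_PER_TOKEN,
    }
```

Running corpus_stats on the site's content folder gives a baseline to compare against the token counts Claude Code reports for a run (63.6k in the first exchange above), and against the same numbers for a larger site.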

Future Directions

It’s still early days, and there’s lots of experimentation to be done. But I’m encouraged by these first tries. Claude Code is much easier to use than GraphRAG, the tool I used for my experiments with graph RAG last year. It’s also faster and more energy- and cost-efficient.

Claude Code also has another advantage over GraphRAG: it can operate on the code itself. Although I haven’t tried it, I imagine it’d be possible to have Claude Code modify the site’s code, content, and metadata. (If you’re going to experiment with this approach, back up your code. There is no undo!)
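For the backup mentioned above, a minimal sketch of one possible approach: copy the whole site folder to a timestamped directory before letting an agent write to it. The function name and paths are hypothetical. If the site already lives in a Git repository, committing or branching first serves the same purpose.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_site(site_dir, backup_root):
    """Copy site_dir into a new timestamped folder under backup_root."""
    site_dir = Path(site_dir).resolve()
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root).resolve() / f"{site_dir.name}-{stamp}"
    # Refuse to copy a folder into itself, which would recurse.
    if site_dir == target or site_dir in target.parents:
        raise ValueError("backup_root must be outside the site folder")
    shutil.copytree(site_dir, target)
    return target
```

The function returns the path of the copy, so a failed experiment can be undone by restoring from that folder.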

But even if limited to read-only queries, Claude Code is an incredibly powerful tool in the information architect’s toolbox. Operating on the entire corpus (rather than individual documents) is the focus of my in-development AI product, which is designed for precise control. But for “quick and dirty” operations, software development agents like Claude Code might be enough, provided you can access your content as plain text.
