What is structured and unstructured data in AI?
Understand the difference and why it matters with this simple guide.
Glenn Jaume
Product Manager at Coda
AI · 4 min read
What is structured and unstructured data?
In the context of AI models, structured and unstructured data can refer to both the information a model can access and the format of the responses it delivers (e.g., a text response, an image, or a table). At its simplest:
- Structured data is information that is highly organized and formatted so that it is easily stored and searched in relational databases (e.g., in rows and columns). Think of a spreadsheet of financial records or a table of customer information from your CRM.
- Unstructured data does not have a pre-defined format and cannot easily fit into a traditional database of columns and rows. Think written content like docs, messages, and emails, or media like videos and images.
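To make the distinction concrete, here is a minimal Python sketch. The customer records, field names, and email text are all invented for illustration and do not come from any real CRM or tool.

```python
# Structured data: every record has the same named fields,
# like the columns of a spreadsheet or database table.
customers = [
    {"name": "Acme Corp", "plan": "Enterprise", "seats": 120},
    {"name": "Globex", "plan": "Team", "seats": 15},
]

# Unstructured data: free-form text with no predefined fields.
email = (
    "Hi team, Acme Corp asked about adding more seats next quarter. "
    "They also mentioned some confusion about billing."
)

# A question about structured data can be answered with an exact
# lookup on a known field.
enterprise_names = [c["name"] for c in customers if c["plan"] == "Enterprise"]

# The text has no field to look up. The simplest option is a substring
# search, which only finds literal matches and misses meaning.
mentions_billing = "billing" in email.lower()

print(enterprise_names)
print(mentions_billing)
```

The structured lookup is precise because the fields are known in advance. The text search only catches the literal word, which is one reason unstructured data usually calls for language-aware techniques rather than simple filters.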
Is unstructured or structured data better in AI?
The simple answer is that both structured and unstructured data have their place in AI and are suited to different use cases:
- Structured data is typically best for AI tasks like classification and bulk categorization, data analysis, or data retrieval (e.g., showing me all my current sales opportunities over $10k).
- Unstructured data is more suited to AI tasks like natural language processing (e.g., sentiment analysis), synthesizing or finding answers from large volumes of text, describing visual media, or transcribing voice to text.
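The two kinds of task can be sketched side by side in Python. This is an illustration under stated assumptions, not how any particular AI product works: the `Opportunity` type, the sample deals, and the word lists are hypothetical, and `toy_sentiment` is a deliberately naive keyword counter standing in for a real language model.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    name: str
    amount_usd: int
    is_open: bool


def open_opportunities_over(opportunities, threshold_usd):
    """Structured retrieval: an exact filter on known fields."""
    return [o for o in opportunities if o.is_open and o.amount_usd > threshold_usd]


# Hypothetical word lists; a real system would use a trained model.
POSITIVE = {"love", "great", "helpful"}
NEGATIVE = {"slow", "confusing", "broken"}


def toy_sentiment(text):
    """Unstructured analysis: a toy keyword count over free text."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


opportunities = [
    Opportunity("Acme renewal", 25_000, True),
    Opportunity("Globex pilot", 8_000, True),
    Opportunity("Initech expansion", 40_000, False),
]

big_open = open_opportunities_over(opportunities, 10_000)
feedback = "I love the new dashboard, but the export is slow and confusing."
label = toy_sentiment(feedback)

print([o.name for o in big_open])
print(label)
```

The filter gives an exact answer because amounts and statuses are stored as fields. The sentiment function has to interpret wording, and the keyword approach here would fail on sarcasm or negation, which is where natural language processing earns its place.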
3 examples of structured and unstructured data in AI.
When using enterprise AI, you will likely need both structured and unstructured data for different use cases. Sometimes it will be one or the other, and sometimes both for a single task. Let’s take a look at what that could look like in reality.
Writing a product proposal.
You’re a product manager creating a write-up for a proposed roadmap feature. You want to gather relevant internal context and data that exists about this feature to help make your case.
- Structured data: You want to include evidence for why you should prioritize building this feature now, so you ask AI what recent deals you’ve lost due to this feature not existing. The AI returns a table of lost deals from Salesforce, filtered to include those mentioning this feature. You insert this directly into your write-up.
- Unstructured data: You ask AI if this feature has been explored before. In response, it looks through your internal docs, Slack channels, and emails, and returns a summary of previous explorations and the reasons they didn’t continue. You add this context to your write-up to provide some background.
Sending a project update.
You’re a project lead and want to send an update on progress in the last week.
- Structured data: You want to include an overview of progress, so you ask AI for project tasks that are completed, blocked, and on track. The AI returns a filtered table of tasks from Asana, so you can insert it into the update without having to link out or screenshot your tasks.
- Unstructured data: You ask AI to summarize that week’s stakeholder meeting. After analyzing your meeting notes, it provides a brief summary of the main decisions and action items, which you insert into your update.
Planning your next sprint.
You’re an engineering leader working on a new feature and are planning what needs to be prioritized in the next sprint.
- Structured data: You want to see progress on this project so far, so you ask AI for an update. You receive a view of all the related issues from Jira, including how complete they are, so you get an overview without switching between tools.
- Unstructured data: You want to check the agreed scope for the feature, so you ask AI to show you the product requirements. It retrieves them from your internal documents and summarizes them for you.