Using LLMs for Data Analysis: Complete Workflow with PandasAI, Code Interpreter & OpenAI

Master LLM-powered data analysis with PandasAI, ChatGPT Code Interpreter, and OpenAI API. Build complete workflows for conversational data science.

  • MIT
  • Updated 2026-05-18

{</* resource-info */>}

Large language models have fundamentally changed how data analysts interact with information. The shift from code-first to conversation-first analysis means that complex data transformations, statistical tests, and visualizations can now be triggered through natural language prompts rather than hundreds of lines of Python or SQL. This transformation does not eliminate the need for analytical rigor โ€” it changes the interface through which analysts express their intent and validate results.

In this guide, we examine three primary tools for LLM-powered data analysis: PandasAI for conversational DataFrame operations within Python environments, ChatGPT’s Code Interpreter (now called Advanced Data Analysis) for ad-hoc exploration without coding, and the OpenAI API for building programmatic analysis pipelines. Each tool serves different personas and use cases, and understanding their strengths and limitations helps you compose a workflow that matches your technical constraints and data sensitivity requirements.

How LLMs Are Transforming Data Analysis #

Traditional data analysis follows a linear pattern: load data into a tool, write code to explore, iterate through visualizations, and synthesize findings. LLMs compress this cycle by generating code from natural language descriptions, suggesting analytical approaches based on data profiles, and even interpreting results in business context. A task that previously required ten lines of Pandas code โ€” filtering a DataFrame, grouping by category, computing aggregates, and sorting โ€” now executes from a single sentence: “Show me total sales by region for Q1 2026, sorted highest to lowest.”

The current generation of LLMs handles several analytical tasks with high reliability. Natural language to SQL translation works well for straightforward queries on well-schemas databases. Automated visualization generation produces appropriate chart types based on data characteristics and the question being asked. Data cleaning suggestions identify missing values, outliers, and formatting inconsistencies. Insight generation flags trends, correlations, and anomalies that human analysts might overlook in large datasets.

However, LLMs carry specific limitations in analytical contexts. Hallucination โ€” generating plausible but incorrect code or conclusions โ€” remains the primary risk. An LLM might invent a column name, misapply a statistical test, or misinterpret a correlation as causation. Current models also struggle with very large datasets (beyond token context windows), complex multi-step reasoning requiring precise intermediate values, and domain-specific analytical conventions that vary across industries.

PandasAI: Conversational DataFrame Operations #

PandasAI, an open-source library maintained on GitHub, adds generative AI capabilities directly to Pandas DataFrames. Instead of writing df.groupby('region')['sales'].sum().sort_values(ascending=False), you write df.chat("What are total sales by region, sorted highest to lowest?") and PandasAI generates the appropriate code, executes it, and returns the result. The library abstracts away prompt engineering and code generation, presenting a clean interface that feels like an extension of Pandas itself.

Behind the scenes, PandasAI constructs detailed prompts that include DataFrame schemas (column names, types, sample values), conversation history, and the user’s question. The LLM generates Python code which PandasAI executes in a controlled environment, with options for Docker sandboxing to isolate potentially harmful operations. Results format automatically based on query type โ€” tables for data retrieval, matplotlib or seaborn plots for visualization requests, and natural language summaries for analytical questions.

PandasAI Setup and Basic Usage #

Installation requires Python 3.9+ and a single pip command:

pip install pandasai

Basic usage wraps any Pandas DataFrame with PandasAI’s SmartDataFrame or Agent interfaces:

import pandas as pd
from pandasai import SmartDataFrame

# Load your data
df = pd.read_csv("sales_data.csv")

# Wrap with PandasAI
sdf = SmartDataFrame(df, config={"llm": "openai"})

# Ask questions in natural language
result = sdf.chat("What is the correlation between marketing spend and revenue?")
print(result)  # Returns correlation coefficient + interpretation

The library supports multiple LLM backends: OpenAI’s GPT models, local models through Ollama or LM Studio, Google’s models, and PandasAI’s own BambooLLM โ€” a specialized model fine-tuned for data analysis tasks. Switching backends requires only changing the configuration dictionary, making it easy to move between cloud and local deployments based on privacy requirements.

Advanced PandasAI Features #

Beyond simple queries, PandasAI offers several capabilities for complex analytical workflows. The Agent mode maintains conversation context across multiple questions, enabling follow-up queries that reference previous results. Custom instructions inject domain-specific guidance โ€” telling the model to always apply seasonal adjustments to time series data, for example. Skill definitions let you register custom Python functions that the LLM can call, extending its analytical capabilities with your proprietary calculations.

Database connectivity allows PandasAI to execute natural language queries against SQL databases through SQLAlchemy connections. This feature proves particularly valuable for analysts who understand business questions but struggle with SQL syntax. The library generates the SQL query, executes it, and returns formatted results. For teams with strict data governance, PandasAI supports local LLM execution through Ollama โ€” no data leaves your infrastructure, addressing the privacy concerns that prevent many organizations from using cloud LLMs.

ChatGPT Code Interpreter / Advanced Data Analysis #

OpenAI’s Code Interpreter, rebranded as Advanced Data Analysis in late 2023, provides the most accessible entry point for non-programmers doing data analysis. Available to ChatGPT Plus and Enterprise subscribers, this feature combines a GPT-4-class model with a sandboxed Python execution environment. Users upload files (CSV, Excel, JSON, images, PDFs), ask analytical questions in natural language, and receive executed code, visualizations, and interpreted results.

The workflow is intentionally frictionless. Upload a CSV file, and Code Interpreter automatically inspects it โ€” showing column names, data types, summary statistics, and a preview of the first rows. From there, analysis proceeds conversationally: “Plot a histogram of the price column,” “What is the correlation between age and purchase amount?,” “Detect any outliers in the transaction volume.” The model generates Python code using Pandas, Matplotlib, Seaborn, and SciPy, executes it, and presents results with explanatory text.

Practical Code Interpreter Workflows #

Effective Code Interpreter usage follows a structured pattern. Start with data validation โ€” verify that the model correctly identified column types and flag any misinterpretations. Next, request exploratory analysis: distributions, correlations, and missing value patterns. Frame analytical questions precisely โ€” “What drives customer churn?” yields better results than “Analyze this data” because it directs the model toward relevant techniques (survival analysis, feature importance, segmentation).

For large files, Code Interpreter has practical limits. Files over 100 MB may fail to upload or process slowly, and complex operations on multi-million-row datasets can timeout. In these cases, request sampling strategies โ€” “Analyze a representative 10% sample first, then validate on the full dataset.” The model can generate code for processing large files in chunks, though execution time constraints may require moving to local Python environments for production-scale work.

OpenAI API for Programmatic Data Analysis #

For production pipelines and custom applications, the OpenAI API provides programmatic access to the same underlying capabilities. The key technique is function calling (now called tool use), which lets you define structured output schemas that the model populates. This transforms free-text responses into parseable data structures suitable for downstream processing.

A typical programmatic workflow combines the OpenAI API with code execution in your environment. You send the API a data schema and analytical question, receive generated Python code, execute it locally (where you have access to full datasets and libraries), and feed results back for interpretation. This pattern โ€” generation, execution, interpretation โ€” forms the core of most LLM-powered analysis pipelines.

The Assistants API extends this with persistent threads, file attachments, and built-in tools including a Code Interpreter. You create an assistant with instructions like “You are a data analyst specializing in financial data,” upload files for context, and users interact through threaded conversations that maintain state. For batch processing large numbers of datasets, the Batch API offers 50% cost reductions with 24-hour turnaround โ€” ideal for overnight processing of survey responses, transaction logs, or sensor data.

Building a Complete LLM-Powered Analysis Pipeline #

Production-grade LLM analysis requires more than prompting a model. A robust architecture layers human oversight between generation and action:

StageComponentPurpose
IngestionData loading + validationEnsure schema compliance and quality checks
PreprocessingLLM-generated cleaning codeHandle missing values, type conversion, formatting
ValidationHuman review of generated codeCatch hallucinations and domain errors
ExecutionControlled runtime (Docker/sandbox)Isolate execution for security
AnalysisLLM-generated statistics/visualizationsExtract insights from processed data
ReviewHuman interpretation + verificationValidate conclusions against domain knowledge
OutputFormatted reports + structured exportsDeliver actionable results to stakeholders

This pipeline treats the LLM as a code generator and pattern recognizer, not an oracle. Every generated code block undergoes human review before execution on production data. Validation steps catch the hallucination errors that inevitably occur โ€” a misspelled column name, an inappropriate statistical test, a visualization that misrepresents the data. The combination of LLM speed and human judgment produces reliable results faster than either alone.

Security, Privacy, and Cost Considerations #

Data privacy represents the primary barrier to LLM adoption in data analysis. When you upload a customer database to a cloud LLM, that data may be used for model training, stored temporarily for processing, or accessible to the service provider’s staff. Different tools offer different privacy guarantees.

PandasAI with local LLMs (Ollama, LM Studio) keeps all data on your hardware โ€” no network transmission occurs. This is the only option suitable for healthcare records, financial transaction data subject to PCI-DSS, or any dataset containing personally identifiable information under GDPR or CCPA. The tradeoff is model capability: local models as of early 2026 generally underperform GPT-4 on complex analytical reasoning, though the gap narrows with each release.

ChatGPT Code Interpreter and the OpenAI API transmit data to OpenAI’s servers. OpenAI’s enterprise terms state that data sent through the API is not used for training, but organizational compliance teams often require additional legal review. For sensitive data, the API supports zero-data-retention agreements on enterprise contracts.

Cost estimation varies by workload. ChatGPT Plus costs $20/month for unlimited Code Interpreter usage โ€” exceptional value for individual analysts. API costs scale with token usage: a typical analytical conversation with a 50,000-row dataset might consume $0.50-$2.00 in tokens. Production pipelines processing thousands of files monthly can reach hundreds of dollars. Local LLMs eliminate per-request costs but require GPU hardware investments ($500-$5,000 depending on model size and throughput needs).

Comparison: When to Use Which Tool #

Use CasePandasAICode InterpreterOpenAI APILocal LLM
Jupyter notebook workflowsExcellentNot applicableGood (code generation)Good
Non-programmer analystsRequires Python basicsExcellentRequires engineeringRequires setup
Production pipelinesPossible (scripted)Not suitableExcellentExcellent
Sensitive/regulated dataGood (with local LLM)PoorModerate (enterprise terms)Excellent
Ad-hoc explorationGoodExcellentModerateModerate
Custom applicationsModerateNot suitableExcellentGood
Cost efficiency (high volume)Good (local)Excellent (Plus sub)ModerateExcellent

PandasAI serves Python-literate analysts who want AI augmentation within their existing Jupyter or IDE workflows. ChatGPT Code Interpreter democratizes data analysis for spreadsheet users who never learned Python. The OpenAI API enables engineers to build custom analysis applications with full control over the user experience. Local LLMs address the privacy requirements that prevent cloud adoption in regulated industries.

The Future of AI-Assisted Data Science #

The tools described in this guide represent early-stage capabilities. Emerging developments point toward more autonomous and integrated analytical systems. Multi-agent frameworks like AutoGen and CrewAI orchestrate teams of specialized AI agents โ€” one for data loading, another for statistical testing, a third for visualization โ€” that collaborate on complex analysis tasks. Early implementations show promise for end-to-end report generation, though human oversight remains essential for quality assurance.

Autonomous data science competitions (often called AutoKaggle or DS-Agent) attempt to solve complete Kaggle-style problems from raw data submission to leaderboard entry without human intervention. Current systems reach bronze-to-silver medal performance on straightforward tabular competitions but struggle with creative feature engineering and domain-specific insights. The trajectory suggests these tools will handle routine analytical tasks within two to three years, freeing human analysts for higher-level strategic work.

Integration with existing BI tools accelerates as well. Microsoft Copilot in Power BI, Tableau’s Ask Data feature, and similar capabilities in Looker and Mode Analytics embed LLM capabilities directly where business users already consume data. The boundary between “data analyst” and “business user with AI assistance” continues to blur.

Frequently Asked Questions #

Can LLMs replace data analysts? #

No. LLMs excel at accelerating mechanical analysis tasks โ€” generating code, producing visualizations, and identifying patterns. They do not replace the domain expertise that frames meaningful questions, the business judgment that prioritizes which findings matter, or the communication skills that translate technical results into organizational action. The most effective analysts in 2026 use LLMs to automate routine work while focusing their human expertise on problem formulation, result validation, and strategic recommendation. Organizations attempting to eliminate analyst roles find that LLM-generated analysis lacks the contextual grounding necessary for consequential decisions.

Is PandasAI free to use? #

The PandasAI library itself is open-source and free. However, using it with OpenAI’s GPT models requires an API key and incurs per-token charges. A typical analysis session with a moderately sized dataset costs between $0.10 and $1.00 in API fees. PandasAI’s BambooLLM offers a freemium tier with usage limits, and running local models through Ollama is entirely free beyond the hardware electricity cost. For teams processing hundreds of datasets monthly, local LLM deployment becomes cost-competitive within weeks.

How accurate is ChatGPT at data analysis? #

Accuracy depends heavily on task complexity and how clearly you phrase your questions. For descriptive statistics, filtering, grouping, and standard visualizations, accuracy exceeds 90%. Statistical test selection and interpretation hover around 75-80% โ€” the model occasionally recommends inappropriate tests or misinterprets p-values. Complex multi-step reasoning involving precise intermediate calculations fails more often, perhaps 40-50% accuracy for convoluted analytical chains. Always verify critical results independently, and treat LLM-generated analysis as a first draft rather than a final answer.

Can I use local LLMs for sensitive data analysis? #

Yes, and this is increasingly the preferred approach for regulated industries. Models like Llama 3.3 (70B), Mistral Large, and Qwen 2.5 run locally through Ollama or vLLM and deliver analytical capabilities sufficient for most business intelligence tasks. The performance gap versus GPT-4 is real but narrowing โ€” local models handle 80-90% of routine analysis tasks competently. For the remaining 10-20% requiring sophisticated reasoning, you can architect hybrid systems where sensitive data preprocessing happens locally, and only aggregated, anonymized results go to cloud APIs for higher-level synthesis.

What are the costs of using OpenAI API for data analysis? #

API costs scale with model choice and usage volume. GPT-4o (the current default for analysis) costs $2.50 per million input tokens and $10.00 per million output tokens as of early 2026. A typical analytical session โ€” loading a dataset schema, asking 10 questions with moderate responses โ€” consumes roughly 50,000-150,000 tokens total, costing $0.50-$2.00. The Assistants API with Code Interpreter tool adds a $0.03 per session code execution fee. For high-volume workloads, the Batch API offers 50% discounts with 24-hour latency. Most individual analysts spend under $50/month; small teams under $500/month; enterprise deployments vary widely based on dataset sizes and query volumes.


To run any of the tools above reliably 24/7, infrastructure matters:

  • DigitalOcean โ€” $200 free credit, 14+ global regions, one-click droplets for AI/dev workloads.
  • HTStack โ€” Hong Kong VPS with low latency for mainland China access. This is the same IDC hosting dibi8.com โ€” production-proven.

Affiliate links โ€” no extra cost to you, helps keep dibi8.com running.

๐Ÿ’ฌ Discussion