Jupyter Notebook Integration

Overview

Daytona sandboxes provide secure, isolated environments for running Jupyter notebooks with full Python capabilities, package management, and data processing. They are well suited to interactive data science, machine learning, and analysis workflows.

Getting Started

Quick Setup

Launch a Jupyter notebook server in a Daytona sandbox:

import { createSandbox } from '@daytona/sdk';

// Create a sandbox with public preview links
const sandbox = await createSandbox({
  name: 'jupyter-env',
  public: true,  // Enable preview links
});

// Install Jupyter and common data science packages
await sandbox.exec(`
  pip install jupyter notebook pandas numpy matplotlib seaborn scikit-learn
`);

// Start Jupyter notebook server
const server = await sandbox.exec(
  'jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root',
  { background: true }
);

// Get preview URL
const notebookUrl = sandbox.getPreviewUrl(8888);
console.log(`Jupyter Notebook: ${notebookUrl}`);
// Example: https://8888-abc123.proxy.daytona.works

Python SDK

from daytona_sdk import Daytona
import os

client = Daytona(api_key=os.getenv('DAYTONA_API_KEY'))

# Create sandbox
sandbox = client.create_sandbox(
    name='jupyter-env',
    public=True
)

# Install Jupyter
sandbox.exec('pip install jupyter notebook pandas matplotlib')

# Start server
sandbox.exec(
    'jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root',
    background=True
)

print(f'Jupyter: {sandbox.get_preview_url(8888)}')
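
A server started in the background may need a few seconds before it accepts connections, so the preview URL can briefly return an error. The following is a minimal sketch of a readiness poll using only the standard library. The helpers `http_probe` and `wait_until_ready` are hypothetical names and not part of the Daytona SDK, and the sketch assumes the preview URL is reachable over HTTP(S) from wherever the script runs.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a non-5xx HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as exc:
        # A 4xx (e.g. 403 before login) still means the server is up;
        # a 5xx usually means a proxy could not reach it yet.
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

With the Python snippet above, this could be called as `wait_until_ready(lambda: http_probe(sandbox.get_preview_url(8888)))` before printing the URL.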

AI-Powered Notebooks

Automated Notebook Generation

Use AI agents to generate and execute Jupyter notebooks:

import openai
from daytona_sdk import Daytona

client = openai.OpenAI()
daytona = Daytona()

# Generate notebook code with AI
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": """
        Create a Jupyter notebook for exploratory data analysis:
        1. Load CSV data
        2. Show summary statistics
        3. Create visualizations
        4. Identify correlations
        
        Output as JSON cells format.
        """
    }]
)

notebook_cells = response.choices[0].message.content

# Create sandbox and upload notebook
sandbox = daytona.create_sandbox(public=True)
sandbox.upload_file(
    '/home/daytona/analysis.ipynb',
    notebook_cells.encode()
)

# Start Jupyter
sandbox.exec('pip install jupyter pandas matplotlib seaborn')
sandbox.exec(
    'jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser',
    background=True
)

print(f'Open: {sandbox.get_preview_url(8888)}')
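
A model reply is free-form text, so writing it directly to an `.ipynb` file does not guarantee a notebook that Jupyter can open. The sketch below wraps a list of cells in a minimal nbformat 4 document using only the standard library. It assumes the model was asked to return a JSON array of objects with `type` (`"code"` or `"markdown"`) and `source` keys; that schema and the `build_notebook` helper are assumptions for illustration, not something the prompt above guarantees.

```python
import json


def build_notebook(cells_json: str) -> str:
    """Wrap a JSON array of {"type", "source"} cells in an nbformat 4 notebook."""
    cells = []
    for cell in json.loads(cells_json):
        if cell["type"] == "code":
            cells.append({
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": cell["source"],
            })
        else:
            # Anything that is not code is stored as a markdown cell.
            cells.append({
                "cell_type": "markdown",
                "metadata": {},
                "source": cell["source"],
            })
    notebook = {
        "cells": cells,
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }
    return json.dumps(notebook, indent=1)
```

The returned string can then be encoded and uploaded in place of the raw model output. `json.loads` raises an error on malformed replies, which is usually preferable to uploading a file that fails later in the browser.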

Interactive Analysis with LangChain

Combine Jupyter with LangChain for AI-assisted data analysis:

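from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_daytona_data_analysis import DaytonaDataAnalysisTool
from langchain.agents import AgentExecutor, create_tool_calling_agent

# Initialize components
llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
tool = DaytonaDataAnalysisTool()

# Upload dataset to sandbox
with open('dataset.csv', 'rb') as f:
    tool.upload_file(
        file=f,
        description="Sales data with columns: date, product, revenue, quantity"
    )

# Prompt for the agent (example wording; the agent_scratchpad placeholder
# is required by create_tool_calling_agent)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a data analysis assistant. Use the tool to run Python code."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create analysis agent
agent = create_tool_calling_agent(llm, [tool], prompt)
executor = AgentExecutor(agent=agent, tools=[tool])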

# Run interactive analysis
queries = [
    "Show me summary statistics for the dataset",
    "What are the top 5 products by revenue?",
    "Create a time series plot of daily revenue",
    "Identify any seasonality patterns",
]

for query in queries:
    print(f"\n> {query}")
    result = executor.invoke({"input": query})
    print(result["output"])

# Results are generated in the sandbox and can be viewed in Jupyter

Advanced Configurations

JupyterLab Setup

Run the more feature-rich JupyterLab interface:

await sandbox.exec('pip install jupyterlab');

await sandbox.exec(
  'jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root',
  { background: true }
);

const labUrl = sandbox.getPreviewUrl(8888);
console.log(`JupyterLab: ${labUrl}`);

Custom Kernel Installation

Install additional Jupyter kernels:

// R kernel
await sandbox.exec(`
  apt-get update && apt-get install -y r-base
  R -e "install.packages('IRkernel')"
  R -e "IRkernel::installspec(user = FALSE)"
`);

// Julia kernel
await sandbox.exec(`
  wget https://julialang-s3.julialang.org/bin/linux/x64/1.9/julia-1.9.0-linux-x86_64.tar.gz
  tar xvf julia-1.9.0-linux-x86_64.tar.gz
  ./julia-1.9.0/bin/julia -e 'using Pkg; Pkg.add("IJulia")'
`);

Data Science Environment

Create a fully-featured data science environment:

# Install comprehensive package set
packages = [
    # Core
    'jupyter', 'jupyterlab', 'notebook',
    
    # Data manipulation
    'pandas', 'numpy', 'scipy',
    
    # Visualization
    'matplotlib', 'seaborn', 'plotly', 'bokeh',
    
    # Machine learning
    'scikit-learn', 'xgboost', 'tensorflow', 'torch',
    
    # Statistics
    'statsmodels',
    
    # Utils
    'openpyxl', 'xlrd', 'requests', 'beautifulsoup4',
]

sandbox.exec(f'pip install {" ".join(packages)}')
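
A long package list is easy to get subtly wrong: a name may be repeated, or an extra such as `tensorflow[and-cuda]` may be interpreted by the shell. As a minimal sketch, the hypothetical helper below (not part of any SDK) removes duplicates while keeping order and quotes each name before building the command string.

```python
import shlex


def pip_install_command(packages: list[str]) -> str:
    """Build a pip install command, dropping duplicates and quoting each name."""
    unique = list(dict.fromkeys(packages))  # dict preserves first-seen order
    return "pip install " + " ".join(shlex.quote(p) for p in unique)
```

The result can be passed to `sandbox.exec(...)` in place of the hand-joined f-string.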

GPU Support

For machine learning workloads requiring GPU:

const gpuSandbox = await createSandbox({
  name: 'ml-notebook',
  resources: {
    gpu: '1',  // Request GPU
    memory: '16Gi',
    cpu: '4',
  },
  public: true,
});

// Install CUDA-enabled packages
await gpuSandbox.exec(`
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  pip install tensorflow[and-cuda]
`);

Common Workflows

Data Upload and Processing

// Upload datasets to sandbox
await sandbox.uploadFile(
  './local-data.csv',
  '/home/daytona/data.csv'
);

await sandbox.uploadFile(
  './analysis.ipynb',
  '/home/daytona/notebooks/analysis.ipynb'
);

// Install requirements
await sandbox.uploadFile(
  './requirements.txt',
  '/home/daytona/requirements.txt'
);
await sandbox.exec('pip install -r /home/daytona/requirements.txt');

Automated Execution

Execute notebooks programmatically:

# Install nbconvert for execution
sandbox.exec('pip install nbconvert')

# Execute notebook
result = sandbox.exec(
    'jupyter nbconvert --to notebook --execute analysis.ipynb --output executed.ipynb'
)

# Download results
sandbox.download_file(
    '/home/daytona/executed.ipynb',
    './results/executed.ipynb'
)

# Download generated artifacts
sandbox.download_file(
    '/home/daytona/output.png',
    './results/output.png'
)
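
Unattended runs can hang on a single slow cell. One way to bound this, sketched below under the assumption that the command is passed to the sandbox as a single shell string, is to build the `nbconvert` invocation with `shlex.join` and set nbconvert's `ExecutePreprocessor.timeout` option (a per-cell limit in seconds). The helper name `nbconvert_command` is hypothetical.

```python
import shlex


def nbconvert_command(notebook: str, output: str, timeout_s: int = 600) -> str:
    """Build a `jupyter nbconvert --execute` command with a per-cell timeout."""
    return shlex.join([
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", output,
        f"--ExecutePreprocessor.timeout={timeout_s}",
    ])
```

Because the arguments are quoted individually, notebook names containing spaces are passed through safely.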

Collaborative Notebooks

Share notebook environments with teams:

// Create long-lived notebook environment
const sharedSandbox = await createSandbox({
  name: 'team-analysis',
  public: true,
  timeout: '24h',  // Keep alive for 24 hours
});

// Setup environment
await setupJupyterEnvironment(sharedSandbox);

// Share URL with team
const notebookUrl = sharedSandbox.getPreviewUrl(8888);
console.log(`Share this URL: ${notebookUrl}`);

// Optional: Set password protection
await sharedSandbox.exec(
  `jupyter notebook password`,
  { input: 'team-password\nteam-password\n' }
);

Scheduled Notebook Runs

Schedule notebook execution for reporting:

import schedule
import time
from datetime import date
from daytona_sdk import Daytona

def run_daily_report():
    """Execute daily analysis notebook"""
    daytona = Daytona()
    sandbox = daytona.create_sandbox()
    
    try:
        # Upload latest data
        sandbox.upload_file('./daily-data.csv', '/data.csv')
        
        # Execute notebook
        sandbox.exec('pip install jupyter nbconvert pandas matplotlib')
        sandbox.exec(
            'jupyter nbconvert --to html --execute report.ipynb'
        )
        
        # Download report
        sandbox.download_file(
            '/home/daytona/report.html',
            f'./reports/report-{date.today()}.html'
        )
        
    finally:
        sandbox.delete()

# Schedule daily at 9 AM
schedule.every().day.at("09:00").do(run_daily_report)

while True:
    schedule.run_pending()
    time.sleep(60)
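
The loop above relies on the third-party `schedule` package and wakes once a minute. As an alternative sketch using only the standard library (a different technique from the one shown, not a drop-in from any SDK), the delay until the next run can be computed directly. The helper name `seconds_until` is hypothetical, and the datetimes are naive local times.

```python
import datetime as dt


def seconds_until(run_at: dt.time, now: dt.datetime) -> float:
    """Seconds from `now` until the next occurrence of `run_at` (today or tomorrow)."""
    target = dt.datetime.combine(now.date(), run_at)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += dt.timedelta(days=1)
    return (target - now).total_seconds()
```

A runner could then sleep for `seconds_until(dt.time(9, 0), dt.datetime.now())` seconds, call the report function, and repeat.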

Integration with Data Analysis Agents

Combine Jupyter with AI coding agents for enhanced workflows:

import dspy
from daytona_interpreter import DaytonaInterpreter

# Create interpreter for Jupyter environment
interpreter = DaytonaInterpreter(
    packages=['jupyter', 'pandas', 'matplotlib', 'seaborn']
)

# Create RLM that can use Jupyter-like REPL
lm = dspy.LM("openrouter/anthropic/claude-3.5-sonnet")
dspy.configure(lm=lm)

rlm = dspy.RLM(
    signature="data_question -> analysis: str",
    interpreter=interpreter,
    verbose=True,
)

# RLM executes Python code iteratively
result = rlm(
    data_question="""
    Load the sales data, calculate monthly trends,
    and create visualizations showing growth patterns.
    """
)

print(result.analysis)
interpreter.shutdown()

Reference: DSPy RLM Integration

Best Practices

Resource Management

// Set appropriate timeouts
const sandbox = await createSandbox({
  timeout: '2h',  // Auto-cleanup after 2 hours
  resources: {
    memory: '8Gi',  // Sufficient for data processing
    cpu: '4',
  },
});

// Clean up when done
try {
  await runNotebookWorkflow(sandbox);
} finally {
  await sandbox.delete();
}
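
The same cleanup guarantee can be expressed in Python with a context manager. This is a minimal sketch that assumes only that the sandbox object exposes a `delete()` method, as in the scheduled-report example; the `managed` helper is a hypothetical name.

```python
from contextlib import contextmanager


@contextmanager
def managed(sandbox):
    """Yield the sandbox and always call its delete() afterwards."""
    try:
        yield sandbox
    finally:
        # Runs on normal exit and when the body raises.
        sandbox.delete()
```

Used as `with managed(daytona.create_sandbox()) as sandbox: ...`, the sandbox is deleted even if the notebook workflow fails partway through.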

Security

# Use tokens for authentication
sandbox.exec(
    'jupyter notebook --NotebookApp.token="secure-token-here"',
    background=True
)

# Or disable auth for internal sandboxes only
sandbox.exec(
    'jupyter notebook --NotebookApp.token="" --NotebookApp.password=""',
    background=True
)
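
A fixed string such as `secure-token-here` is only a placeholder. The sketch below generates a random token with the standard `secrets` module and builds the launch command around it. The helper name is hypothetical, and the `--NotebookApp.token` option is taken from the example above.

```python
import secrets
import shlex


def jupyter_command_with_token(port: int = 8888) -> tuple[str, str]:
    """Return (token, command) for a token-protected notebook server."""
    token = secrets.token_urlsafe(32)  # URL-safe, so it can go in a query string
    command = shlex.join([
        "jupyter", "notebook",
        "--ip=0.0.0.0", f"--port={port}", "--no-browser",
        f"--NotebookApp.token={token}",
    ])
    return token, command
```

Jupyter accepts the token as a `?token=...` query parameter, so the value can be appended to the preview URL when sharing it with trusted users.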

Performance

# Pre-install common packages in image
# Create custom sandbox image with packages pre-installed
# to reduce startup time

# Use persistent storage for large datasets
sandbox.mount_volume(
    source='data-volume',
    target='/home/daytona/data'
)

Troubleshooting

Notebook Server Not Starting

// Check whether anything is listening on port 8888
await sandbox.exec('netstat -tuln | grep 8888');

// List running Jupyter servers and their URLs
await sandbox.exec('jupyter notebook list');

Kernel Issues

// List available kernels
await sandbox.exec('jupyter kernelspec list');

// Reinstall the default Python kernel
await sandbox.exec('python -m ipykernel install --user');
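
When checking kernels from a script, the machine-readable form `jupyter kernelspec list --json` is easier to work with than the plain listing. The sketch below parses that JSON into a mapping of kernel name to language. How the command's stdout is retrieved from `sandbox.exec` is not shown in this guide, so the function takes the output as a plain string; `kernel_languages` is a hypothetical helper name.

```python
import json


def kernel_languages(kernelspec_json: str) -> dict[str, str]:
    """Map kernel name -> language from `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json)["kernelspecs"]
    return {
        name: info.get("spec", {}).get("language", "unknown")
        for name, info in specs.items()
    }
```

A missing `python3` entry in the result suggests the default kernel needs to be reinstalled.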

Package Installation Failures

// Update pip first
await sandbox.exec('pip install --upgrade pip');

// Pin a specific package version
await sandbox.exec('pip install pandas==2.0.0');

Related Resources

Data Analysis - AI-powered data workflows
AI Coding Agents - Automated development
Python SDK - Sandbox control via Python
Network Configuration - Access notebook servers via port previews
