Python Package

Access and analyze Shedding Hub data programmatically

The shedding-hub Python package provides tools for loading datasets, running statistical analyses, and creating visualizations. It is available on PyPI, so an analysis can be scripted end to end and rerun as part of an existing workflow.

Installation

Install the package using pip:

pip install shedding-hub

For the latest development version from GitHub:

pip install git+https://github.com/shedding-hub/shedding-hub.git

Quick Start

Load a Dataset

import shedding_hub

# Load a specific dataset by identifier
data = shedding_hub.load_dataset('woelfel2020virological')

# Access dataset metadata
print(data.title)
print(data.description)
print(data.analytes)

Analyze Shedding Patterns

# Perform statistical analysis
analysis = shedding_hub.analyze(data)

# View summary statistics
print(analysis.summary())

# Get participant-level statistics
participant_stats = analysis.by_participant()

# Calculate time-course metrics
time_metrics = analysis.time_course_metrics()

Create Visualizations

import shedding_hub.viz as viz

# Plot shedding time-course for all participants
viz.plot_time_course(data)

# Create concentration distribution plots
viz.plot_concentration_distribution(data, biomarker='SARS-CoV-2')

# Generate summary dashboard
viz.create_dashboard(data, output='dashboard.html')

Features

Data Access

  • Load datasets directly from the Shedding Hub repository
  • Parse YAML data into Python objects
  • Automatic validation and type checking
  • Support for both local and remote data sources
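As a rough illustration of what validation and type checking can look like once a YAML file has been parsed into plain Python dictionaries, the sketch below converts one record into a typed object and rejects malformed input. The `Measurement` class, the `parse_measurement` helper, and the field names (`participant_id`, `time`, `value`) are hypothetical; they are not taken from the shedding-hub schema or API.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Union


@dataclass(frozen=True)
class Measurement:
    """One hypothetical measurement record (field names are assumptions)."""
    participant_id: str
    time: float                 # e.g. days relative to a reference event
    value: Union[float, str]    # numeric concentration, or the text "negative"


REQUIRED_FIELDS = {"participant_id", "time", "value"}


def parse_measurement(raw: Mapping[str, Any]) -> Measurement:
    """Validate a parsed-YAML mapping and convert it to a Measurement."""
    missing = REQUIRED_FIELDS - set(raw)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    value = raw["value"]
    if isinstance(value, str):
        # Only one text marker is accepted in this sketch.
        if value != "negative":
            raise ValueError(f"unexpected text value: {value!r}")
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        if value < 0:
            raise ValueError("concentration must be non-negative")
        value = float(value)
    else:
        raise TypeError(f"unsupported value type: {type(value).__name__}")

    return Measurement(
        participant_id=str(raw["participant_id"]),
        time=float(raw["time"]),
        value=value,
    )


record = parse_measurement({"participant_id": 7, "time": 3, "value": 1200})
print(record)  # Measurement(participant_id='7', time=3.0, value=1200.0)
```

Failing early with a specific error keeps malformed records out of later analysis steps, where they would be harder to trace.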

Analysis Tools

  • Statistical analysis of shedding dynamics
  • Time-course modeling utilities
  • Participant and population-level summaries
  • Detection limit handling
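Detection limit handling usually means deciding what to do with samples reported below the assay's limit of detection (LOD). The sketch below shows one common convention: substitute LOD/2 for non-detects and keep a flag so censored values can be treated differently later. It is a generic illustration, not the package's implementation; the function name `apply_detection_limit`, the substitution rule, and the sample values are all assumptions.

```python
from typing import List, Optional, Tuple


def apply_detection_limit(
    values: List[Optional[float]], lod: float
) -> Tuple[List[float], List[bool]]:
    """Replace non-detects with lod / 2 and report which entries were censored.

    A value counts as a non-detect if it is None or falls below `lod`.
    """
    if lod <= 0:
        raise ValueError("lod must be positive")

    substituted: List[float] = []
    censored: List[bool] = []
    for v in values:
        below = v is None or v < lod
        censored.append(below)
        substituted.append(lod / 2 if below else v)
    return substituted, censored


# Made-up concentrations; None marks a sample with no detectable signal.
concentrations = [5200.0, None, 40.0, 980.0]
values, censored = apply_detection_limit(concentrations, lod=100.0)
print(values)    # [5200.0, 50.0, 50.0, 980.0]
print(censored)  # [False, True, True, False]
```

Substitution is simple but can bias summary statistics when many samples are non-detects; keeping the `censored` flags leaves room for a censored-likelihood model instead.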

Visualization

  • Time-course plots with customization options
  • Distribution and summary visualizations
  • Interactive HTML dashboards
  • Publication-ready figure export

Export Formats

  • pandas DataFrame conversion
  • CSV and JSON export
  • Integration with R via reticulate
  • Stan/JAGS model data preparation
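For a sense of what the CSV and JSON exports involve, the sketch below writes a small list of records to both formats using only the standard library. The records, column names, and the `to_csv` / `to_json` helpers are invented for illustration; the package's own export methods and output layout may differ.

```python
import csv
import io
import json

# Made-up long-format records; None marks a missing concentration.
records = [
    {"participant_id": "p1", "time": 2, "concentration": 5200.0},
    {"participant_id": "p1", "time": 5, "concentration": 980.0},
    {"participant_id": "p2", "time": 3, "concentration": None},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text; None becomes an empty cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize a list of dicts to JSON text; None becomes null."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(records)
json_text = to_json(records)
print(csv_text)
```

Note that the two formats represent missing values differently (an empty cell versus `null`), which matters when the files are read back in R or another tool.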

Resources

PyPI

View package information, release history, and installation instructions.

Visit PyPI

GitHub

Source code, issue tracking, and contribution guidelines.

View on GitHub

Documentation

API reference, tutorials, and usage examples.

Read the Docs

Example Workflows

Compare Shedding Across Multiple Studies

import shedding_hub

# Load multiple datasets
datasets = [
    shedding_hub.load_dataset('woelfel2020virological'),
    shedding_hub.load_dataset('han2020sequential'),
    shedding_hub.load_dataset('kim2020viral')
]

# Perform comparative analysis
comparison = shedding_hub.compare(datasets,
                                  biomarker='SARS-CoV-2',
                                  specimen='stool')

# Generate comparison report
comparison.report(output='comparison.html')

Prepare Data for Bayesian Modeling

import json

import shedding_hub

# Load dataset and filter for specific biomarker
data = shedding_hub.load_dataset('woelfel2020virological')
filtered = data.filter(biomarker='SARS-CoV-2', specimen='stool')

# Export for Stan
stan_data = filtered.to_stan(
    response_var='concentration',
    time_var='time_since_onset'
)

# Save the Stan data dictionary as JSON for modeling
with open('model_data.json', 'w') as f:
    json.dump(stan_data, f)
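Stan reads its data as a JSON object whose keys match the model's data block, typically an observation count plus equal-length arrays. The sketch below builds such a dictionary by hand from long-format records. The `build_stan_data` helper, the key names (`N`, `n_participants`, `participant`, `time`, `log_concentration`), and the sample values are assumptions chosen for illustration; they are not necessarily what `to_stan` returns.

```python
import json
import math

# Made-up records: (participant id, days since onset, concentration).
records = [
    ("p1", 2.0, 5200.0),
    ("p1", 5.0, 980.0),
    ("p2", 3.0, 15000.0),
]


def build_stan_data(rows):
    """Convert long-format records into a Stan-style data dictionary."""
    # Stan arrays are 1-indexed, so participants are numbered from 1.
    ids = sorted({pid for pid, _, _ in rows})
    index = {pid: i + 1 for i, pid in enumerate(ids)}
    return {
        "N": len(rows),
        "n_participants": len(ids),
        "participant": [index[pid] for pid, _, _ in rows],
        "time": [t for _, t, _ in rows],
        "log_concentration": [math.log10(c) for _, _, c in rows],
    }


stan_data = build_stan_data(records)
print(json.dumps(stan_data, indent=2))
```

Working on the log10 scale is a common choice for concentrations that span several orders of magnitude, but the appropriate transform depends on the model being fit.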

Export to pandas for Custom Analysis

import shedding_hub
import pandas as pd

# Load and convert to DataFrame
data = shedding_hub.load_dataset('woelfel2020virological')
df = data.to_dataframe()

# Perform custom analysis with pandas
summary = df.groupby(['participant_id', 'biomarker']).agg({
    'concentration': ['mean', 'max', 'count'],
    'time': ['min', 'max']
})

print(summary)

Contributing

We welcome contributions to the shedding-hub package! Whether you're fixing bugs, adding new features, improving documentation, or suggesting enhancements, your help is appreciated.

Please visit our Contributing Guidelines to get started. For major changes, please open an issue first to discuss what you would like to change.