# DeepChem: Deep Learning for Drug Discovery and Chemistry

> DeepChem is a Python library that provides tools for applying deep learning to drug discovery, materials science, quantum chemistry, and biology, with built-in molecular featurizers, datasets, and model architectures.

## Quick Use

```bash
pip install deepchem
```

```python
import deepchem as dc

# Load the Delaney (ESOL) solubility dataset from MoleculeNet,
# featurized for graph convolutions.
tasks, datasets, transformers = dc.molnet.load_delaney(featurizer='GraphConv')
train, valid, test = datasets

# Train a graph convolution regression model.
# Note: GraphConvModel needs the TensorFlow backend installed.
model = dc.models.GraphConvModel(n_tasks=1, mode='regression')
model.fit(train, nb_epoch=50)

# Report the squared Pearson correlation on the held-out test set.
metric = dc.metrics.Metric(dc.metrics.pearson_r2_score)
print(model.evaluate(test, [metric]))
```

## Introduction

DeepChem is an open-source library that aims to make deep learning accessible for the life sciences and chemistry. It provides molecular featurizers (fingerprints, graph convolution features, and other encodings built from SMILES strings), curated benchmark datasets (MoleculeNet), and model implementations for predicting molecular properties, protein-ligand binding, toxicity, and more. DeepChem bridges the gap between chemistry domain knowledge and modern ML tooling.

## What DeepChem Does

- Molecular property prediction using graph neural networks and fingerprints
- Virtual screening and drug-target interaction modeling
- Quantum chemistry property prediction
- Material property prediction for inorganic compounds
- Protein-ligand docking score prediction

## Architecture Overview

DeepChem wraps TensorFlow, PyTorch, and JAX backends with a unified Model API. The data pipeline converts raw molecules (SMILES strings, SDF files) into featurized datasets using fingerprint, graph, or Coulomb matrix featurizers. MoleculeNet provides standardized benchmark datasets with train/valid/test splits.
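
To make the featurization step more concrete, here is a minimal from-scratch sketch of one of the featurizers named above, the Coulomb matrix. It uses the standard definition (diagonal entries `0.5 * Z_i ** 2.4`, off-diagonal entries `Z_i * Z_j / |R_i - R_j|` in atomic units) and only the Python standard library. The function name `coulomb_matrix` and the H2 geometry are illustrative assumptions; this is not DeepChem's own implementation, which also handles details such as padding to a fixed size.

```python
import math


def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix as a list of lists (hypothetical helper).

    charges: nuclear charges Z_i, one per atom
    coords:  (x, y, z) positions in Bohr, one per atom
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: fitted self-interaction term of the atom.
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal: Coulomb repulsion between nuclei i and j.
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Illustrative input: H2 with the two nuclei 1.4 Bohr apart (assumed geometry).
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
for row in h2:
    print(row)
```

The result is a symmetric matrix that encodes both element identity and 3D geometry, which is why this kind of featurizer tends to be used for quantum chemistry targets rather than for SMILES-only datasets.
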
Models range from simple fully-connected networks to graph convolutional networks, attention-based architectures, and normalizing flows for generative chemistry.

## Self-Hosting & Configuration

- Install via pip: `pip install deepchem`
- Requires Python 3.8+, NumPy, and either TensorFlow or PyTorch
- RDKit recommended for molecular featurization (install via conda)
- No GPU required, but one is recommended for training deep models
- MoleculeNet datasets download automatically on first use

## Key Features

- MoleculeNet: a curated collection of molecular benchmark datasets
- Molecular featurizers for fingerprints, graphs, Coulomb matrices, and more
- Multi-backend support: TensorFlow, PyTorch, and JAX
- Pre-built model architectures for common chemistry ML tasks
- Tutorials covering drug discovery, materials, and genomics applications

## Comparison with Similar Tools

- **RDKit**: cheminformatics toolkit for molecular manipulation; DeepChem adds deep learning on top
- **PyTorch Geometric**: general graph neural networks; DeepChem is domain-specialized for chemistry
- **DGL-LifeSci**: DGL's life science module; DeepChem has broader task and dataset coverage
- **SchNet/DimeNet**: specific architectures; DeepChem bundles multiple architectures with a unified API

## FAQ

**Q: Do I need chemistry expertise to use DeepChem?**

A: A basic understanding of SMILES notation helps, but DeepChem's tutorials walk you through drug discovery workflows step by step.

**Q: Can DeepChem generate new molecules?**

A: Yes. DeepChem includes generative models such as normalizing flows and reinforcement learning-based molecule generation.

**Q: What is MoleculeNet?**

A: MoleculeNet is a benchmark suite of molecular datasets with standardized splits and metrics, designed for fair comparison of molecular ML methods.

**Q: Does DeepChem work with proteins?**

A: Yes. DeepChem supports protein-ligand interaction prediction and includes featurizers for protein sequences.
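
As a rough idea of what featurizing a protein sequence can look like, the sketch below one-hot encodes a sequence of one-letter amino acid codes using only the standard library. This is a generic technique shown for illustration; the names `AMINO_ACIDS` and `one_hot_sequence`, and the choice to map unknown residues to an all-zero row, are assumptions and not necessarily how DeepChem's featurizers behave.

```python
# The 20 standard amino acids as one-letter codes (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_sequence(sequence, alphabet=AMINO_ACIDS):
    """Encode a protein sequence as a list of one-hot rows (hypothetical helper).

    Residues outside the alphabet become all-zero rows.
    """
    index = {residue: i for i, residue in enumerate(alphabet)}
    encoded = []
    for residue in sequence.upper():
        row = [0] * len(alphabet)
        if residue in index:
            row[index[residue]] = 1
        encoded.append(row)
    return encoded


# Illustrative three-residue input: Met-Lys-Thr.
features = one_hot_sequence("MKT")
print(len(features), "residues x", len(features[0]), "channels")
```

Each residue becomes one row, so a sequence of length N yields an N x 20 matrix that a sequence model can consume alongside ligand features.
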

## Sources

- https://github.com/deepchem/deepchem
- https://deepchem.io/

---

Source: https://tokrepo.com/en/workflows/asset-4b2b4e9b
Author: AI Open Source