Statistical and Neural Machine Translation
Access datasets and resources for statistical and neural machine translation research, including yearly releases and tools for language processing projects.
Download machine translation research datasets
This website is a resource hub for researchers and developers interested in machine translation. It hosts datasets and tools for statistical and neural approaches to translation, with releases organized by year.
Whether you're building translation models, analyzing language data, or looking for benchmarks to test your algorithms, this site offers curated datasets and links to tools such as the Moses statistical machine translation toolkit. It's especially useful if you're working on academic or industry projects in natural language processing or AI.
With its straightforward layout and focus on sharing open research data, the site makes it easy to access and download what you need for your next machine translation project.
Discover websites similar to data.statmt.org.
VIAF links name authority files from libraries worldwide, making it easy to find and connect information about authors and organizations in one place.
Explore fossil records and paleobiological data with the Paleobiology Database, a scientific resource for researchers and enthusiasts worldwide.
Access global nuclear data, research tools, and resources from the IAEA for science, energy, medicine, and safety applications worldwide.
ImageNet is a large research database of labeled images, widely used for computer vision and AI research, maintained by Stanford University and Princeton University.
Explore comprehensive gene function data and annotations to support biological research, with tools for accessing, browsing, and contributing to gene ontology.
PharmGKB is a pharmacogenomics research database with curated information on how genetic variation affects drug response and treatment outcomes.
Explore how environmental chemicals impact human health with CTD, a research database linking chemicals, genes, and diseases for scientific discovery.
Explore protein domains, families, and functional sites with PROSITE, a research database for identifying protein patterns and profiles in biology.
BindingDB is a database of measured protein–ligand binding affinities, helping researchers find and analyze chemical and biological interaction data.
BioGRID is a free online database offering searchable and downloadable data on protein, genetic, and chemical interactions across major model organisms.
Explore and access a wide range of biomedical ontologies and vocabularies to support research, data sharing, and collaboration in life sciences.
Explore detailed data on all human proteins in cells and tissues, with powerful search and visualization tools for research and discovery.
Explore a comprehensive genome taxonomy database for bacteria and archaea, offering standardized classification and phylogenetic tools for researchers.
Explore and access public research data on human brain connections, including studies on brain health, development, aging, and related diseases.
MycoBank is an online database for fungal taxonomy, offering detailed information about fungi species, names, and related scientific data.
Access open, curated databases of microbial genome sequences and typing data for over 140 species, including provenance and phenotype information.
FAIRsharing helps you find, share, and cite data standards, databases, and policies for research across disciplines.
miRBase is an online database for searching and browsing microRNA sequences, letting you explore and download genetic information by organism or chromosome.
Databrary is an open research data library for sharing, storing, and reusing video and related data collected for developmental and behavioral science studies.
Explore a global database of algae species with scientific data, images, and literature references for research, education, or personal interest.
Explore detailed microarray gene expression data from the human brain, including interactive heatmaps and search tools for neuroscience research.
Explore detailed data on protein-coding gene families, their functions, and genetic variations to support biomedical and evolutionary research.
BioNames connects taxonomic names to original descriptions, taxa, and phylogenies, making it easy to explore scientific naming information online.
Explore a global database of fungal species, offering taxonomic information and datasets for researchers, scientists, and anyone interested in mycology.
Access cross-national microdata for research and analysis with remote tools from the LIS Data Center in Luxembourg. Ideal for social science studies.
Explore DNA barcode data to identify species and access tools, resources, and research materials for biodiversity and taxonomy studies worldwide.
Explore a specialized database of bitter compounds, their structures, and related scientific data. Designed for researchers in biochemistry and food science.
Explore integrated cross-species gene, variant, and disease data to support research and discovery in genetics and biomedical science.
BRENDA is a comprehensive enzyme database offering detailed biochemical, molecular, and functional data for researchers in biology and life sciences.
openICPSR lets you share and access behavioral health and social science research data for free, supporting open science and public research access.
Access a vast archive of public opinion polls and survey data from the United States and worldwide for research, teaching, or informed decision-making.
Search and explore information about ongoing and completed clinical trials worldwide, including study details, results, and participation options.
Explore millions of chemical compounds, properties, and research data in the world’s largest free chemistry database. Ideal for scientists and students.
Find detailed information on hazardous chemicals, their properties, and potential risks using this searchable database from NOAA.
Explore detailed information on human metabolites, diseases, proteins, and pathways with this comprehensive research database for metabolomics studies.
Explore protein structures predicted by AlphaFold. Search, browse, and download detailed 3D models to support scientific research and discovery.
CIDeRplus is a curated database of disease-related biomolecule interactions, designed for scientists and bioinformatics research. Available in English.
Explore global species taxonomy with ITIS, an open database offering scientific names and classifications for plants, animals, and other organisms.
Explore protein structures and classifications with CATH, a database for researchers studying protein sequences, structures, and their relationships.
Explore and access a wide range of research outputs, data, and publications from Springer Nature in this user-friendly online research repository.