Weka offers open source machine learning tools in Java for data mining, analysis, and visualization, making it easy to explore and model data sets.
Explore data with open source machine learning tools
Weka is an open source platform that provides a suite of machine learning tools built in Java. With Weka, you can analyze and visualize data, build predictive models, and experiment with a wide range of algorithms, all from an easy-to-use interface.
Whether you’re a student learning about data mining or a professional working with large data sets, Weka helps you dig into your data and uncover patterns. It’s designed to be accessible for beginners while still offering the depth and flexibility needed by more advanced users. If you’re interested in exploring machine learning without complicated setup, Weka is a great place to start.
Discover websites similar to Weka.sourceforge.io. Section 1 prioritizes sites with matching domain extensions and/or languages. Section 2 offers worldwide alternatives.
ELKI is an open-source Java framework for data mining, focusing on clustering and outlier detection with extensible algorithms and benchmarking tools.
Explore LightGBM’s official documentation for guides, tutorials, and API references on this fast, distributed gradient boosting framework for machine learning.
Explore YDF documentation to learn how to train, evaluate, and deploy decision forest models like Random Forests using this open-source machine learning library.
Explore XGBoost's official documentation for setup guides, tutorials, and detailed info on this popular machine learning library and its many features.
Seldon helps businesses manage, deploy, and monitor machine learning and AI models, offering flexible tools for real-time workflows and observability.
Test and protect ML models from adversarial attacks
Adversarial Robustness Toolbox offers open-source, Python-focused tools to test, defend, and certify machine learning models against security threats.
Explore an open-source resource hub for probabilistic machine learning, featuring books, tutorials, and code to help you learn and apply ML concepts.
Keras offers user-friendly tools and guides for building deep learning models, making machine learning accessible and efficient for developers of all levels.
Explore scikit-optimize, a Python library for efficient hyperparameter optimization using sequential model-based methods. Includes guides and docs.
Prometheus is an open-source tool for monitoring systems and analyzing time series data with powerful metrics, alerts, and flexible querying.
Generate knowledge graphs easily with RML.io tools for Windows, Mac, and Linux. Use simple rules to turn your data into structured, connected insights.
PipelineDP lets you build data pipelines that aggregate user data efficiently while keeping privacy protected with modern, secure techniques.
Get real-time crisis alerts with Samdesk, a global platform that uses AI and big data to help you monitor and respond to disruptions as they happen.
Explore open-source tools for big data genomics, including ADAM and Cannoli, for scalable genomic data analysis using Spark, Python, and R.
Vega lets you create, edit, and share interactive data visualizations using a simple JSON format, perfect for exploring and presenting your data visually.
OpenActive provides open data tools and resources to help people access sport and physical activity opportunities, making it easier to get active.
twarc is a command line tool and Python library for collecting, archiving, and analyzing Twitter JSON data using the Twitter API, with plugin support.
Explore detailed documentation and guides for the hdbscan Python library, which helps you find clusters in data using advanced machine learning techniques.
Explore detailed guides and documentation for emcee, a Python tool for Markov chain Monte Carlo (MCMC) sampling and model fitting in data analysis.
Find detailed, accurate IP address data with IPinfo.io. Access geolocation, privacy, and company info by API or database for secure, reliable results.
Clarify lets industrial businesses connect, analyze, and automate their operational data for better insights and smarter decision-making.
Explore and visualize U.S. public data with interactive charts, maps, and reports. Find insights on industries, locations, education, and more.
Holistics is a self-service analytics platform that lets teams explore, visualize, and share data insights using modern BI and DevOps best practices.
Frictionless Data offers open-source tools and standards to simplify working with complex data, making integration and management easier for teams and individuals.
lakeFS is an open-source tool that brings Git-like version control to your data, helping you manage and track changes in cloud object storage easily.
BIDS is a community-driven platform that provides a standard for organizing and sharing neuroimaging and behavioral data to simplify research collaboration.
Piano Audience helps you collect, segment, and activate customer data so you can understand your audience and personalize experiences across your business.
Analyze Java, Android, and Node.js GC logs online to detect memory issues, long pauses, and get tuning tips. Free, easy JVM garbage collection log analyzer.
Get real-time price insights for retail and industry in Brazil, powered by crowdsourced data from both physical stores and e-commerce, even in remote regions.
Manage, analyze, and report data from all your energy plant devices in one place. Access SCADA screens and detailed energy reports easily. (Turkish site)
Track and visualize machine learning experiments, monitor model metrics, and debug training runs with Neptune.ai's experiment tracking platform.
BigML is an easy-to-use machine learning platform for building models, making predictions, and analyzing data without complex setup or coding.
Stan is an open-source platform for Bayesian data analysis and statistical modeling, offering tools, documentation, and a supportive user community.
Netron lets you open and visualize neural network, deep learning, and machine learning models right in your browser for easy exploration.
LAION is a nonprofit sharing open machine learning datasets, tools, and models to support research, education, and accessible AI development for everyone.
Apache Mahout is a distributed linear algebra and machine learning platform for building custom algorithms, designed for data scientists and developers.
Flyte helps you build, deploy, and manage scalable data and machine learning workflows, making it easy to unify your data, ML, and analytics projects.
Weights & Biases helps AI developers track experiments, manage models, and streamline machine learning workflows from training to production.
Dataiku is a platform to build, deploy, and manage AI and analytics projects, helping teams turn data into business insights and smarter decisions.
Explore benchmark datasets and results for computer vision and machine learning, with a focus on German traffic sign recognition and classification.