This is the archived website for DAT Linux 1. Please visit datlinux.com for the DAT Linux 2 site.
🧰️ PRO
Introducing: DAT Linux PRO tools. Enhance your DAT Linux with extra power-tools including back-up/restore, app update notifications, app monitoring, custom links tab, dark theme, etc. One payment, perpetual license. Get PRO now!
Introduction
DAT Linux is a Linux distribution for data science. It brings together all your favourite open-source data science tools and apps into a ready-to-run desktop environment. It’s based on Ubuntu 22.04, so it’s easy to install and use. The custom DAT Linux Control Panel provides a centralised one-stop-shop for running and managing dozens of data science programs. Read the FAQ.
📚️ Check out the DAT Linux curated list of free online data science e-books!
DAT Linux is perfect for students, professionals, academics, or anyone interested in data science who doesn’t want to spend endless hours downloading, installing, configuring, and maintaining applications from a range of sources, each with different technical requirements and set-up challenges.
👍 Recommend DAT Linux on DistroWatch
Get started:
- ⬇️ Download DAT Linux, and get on with doing data science without the headaches.
- ℹ️ Read the FAQ for answers to questions you may have.
- 💬️ GitHub Community channel for announcements, or to post any feedback, issues or general queries.
List of supported data science apps:
💳️ Please subscribe/donate to help support DAT Linux development
| App | Description |
|---|---|
| BiRT | Eclipse BIRT™ is an open source reporting system for producing compelling BI reports |
| ClickHouse | ClickHouse is an open-source column-oriented DBMS for online analytical processing |
| Data Cleaner | Data Quality toolkit that allows you to profile, correct, and enrich your data |
| Datasette | Datasette is a tool for exploring and publishing data visually and with SQL |
| DB Browser | DB Browser for SQLite is a visual, open source tool to create, design, and edit database files compatible with SQLite |
| DBeaver | Free multi-platform database tool for developers, database administrators, analysts and all people who need to work with databases |
| Druid | Apache Druid is a real-time database to power modern analytics applications |
| D-Search | Convenient interface to the “webtools” R package to search for datasets in –all– CRAN packages |
| DuckDB | DuckDB is an in-process SQL OLAP database management system |
| E-Git | EGit is an Eclipse-based GUI for the Git version control system |
| Emacs+ESS | Emacs Speaks Statistics (ESS) is an add-on package for GNU Emacs to interact with statistical analysis programs such as R, S-Plus, SAS, Stata and OpenBUGS/JAGS |
| Gephi | Gephi is the leading visualization and exploration software for all kinds of graphs and networks |
| Glue-viz | Glue is a UI and Python library to explore relationships within and among related datasets |
| Gnumeric | Gnumeric is a spreadsheet program that is part of the GNOME Free Software Desktop Project |
| GNU Plot | gnuplot is a command-line and GUI program that can generate two- and three-dimensional plots of functions, data, and data fits |
| Grafana | Grafana is a popular open-source platform for data visualization and monitoring |
| G-Vim | A GUI wrapper for the Vim screen-based text editor program, with plugins for R installed |
| IPython | A command shell for interactive computing with a convenient console launcher |
| Julia | Julia is a high-level, high-performance, dynamic programming language |
| Jupyter Notebook | The Jupyter Notebook is a web-based interactive, scientific computing platform |
| Jupyter Lab | JupyterLab is the latest web-based interactive development environment for notebooks, code, and data |
| KNIME | KNIME Analytics Platform is open source software for data science |
| LabPlot | Free, open source and cross-platform Data Visualization and Analysis software accessible to everyone |
| LibreOffice Calc | LibreOffice Calc is the spreadsheet component of the LibreOffice software package |
| Luigi | Luigi provides a framework to develop and manage data processing pipelines |
| Meld | Meld is a visual file diff and merge tool |
| Metabase | Metabase is an open-source business intelligence tool |
| MOA | MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms |
| OpenRefine | OpenRefine is an open-source desktop application for data cleanup and transformation to other formats |
| Orange | Orange is a powerful platform to perform data analysis and visualization |
| Paraview | ParaView is an open-source, multi-platform data analysis and visualization application |
| Pluto | A Pluto notebook is made up of small blocks of Julia code (cells) and together they form a reactive notebook |
| PSPP | GNU PSPP is a program for statistical analysis of sampled data. It is a free-as-in-freedom replacement for the proprietary program SPSS |
| QGIS | QGIS is a Free and Open Source Geographic Information System |
| Quarto | Quarto® is an open-source scientific and technical publishing system built on Pandoc |
| R | R is a free software environment for statistical computing and graphics |
| R-Studio | RStudio is an Integrated Development Environment (IDE) for R |
| Scilab | Scilab is a free and open-source, cross-platform numerical computational package and a high-level, numerically oriented programming language |
| Spyder | Spyder is a free and open source scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts |
| Superset | Apache Superset is a modern, enterprise-ready business intelligence web application |
| Tabula | Tabula is a free tool for extracting data from PDF files into CSV and Excel files |
| Veusz | Veusz is a scientific plotting and graphing program with a graphical user interface, designed to produce publication-ready 2D and 3D plots |
| Visidata | Visidata is an interactive multitool for tabular data. It combines the clarity of a spreadsheet, the efficiency of the terminal, and the power of Python, which can handle millions of rows with ease |
| VSCodium | VSCodium is a community-driven, freely-licensed binary distribution of Microsoft’s editor VS Code (ready with plugins for R/RMarkdown, Python/Jupyter, Julia) |
| Weka | Weka is a GUI and collection of machine learning algorithms for data mining tasks |
| WxMaxima | wxMaxima is a document-based interface for the computer algebra system Maxima |
| Zeppelin | Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, Python, R and more |
NUMPY BY EXAMPLE - A Beginner's Guide to Learning NumPy by the DAT Linux team.
🛒️ BUY the PDF or EPUB e-book from Leanpub.