Data Science workgroup

{{delete|reason=This workgroup may no longer be active. Last substantial page edit before import.}}{{outdated|Other than site-wide fixes, this page has not seen recent updates.}}
This workgroup is dedicated to improving the state of the data science stack in Nixpkgs. This includes work on packages and modules for scientific computation, artificial intelligence and data processing, as well as data science IDEs.
=== JupyterLab ===
The [https://github.com/tweag/jupyterWith JupyterWith] repo "provides a Nix-based framework for the definition of declarative and reproducible Jupyter environments. These environments include JupyterLab - configurable with extensions - the classic notebook, and configurable Jupyter kernels."
Alternatively, [https://github.com/NixOS/nixpkgs/pull/49807 there is an unmerged pull request] that aims to make it easy to deploy arbitrary kernels and Jupyter extensions with Nix. It has some limitations, because JupyterLab extensions rely heavily on npm and webpack to compile their JavaScript modules, so an impure setup was considered the easiest way to get them working. If the pull request were merged, the following <code>default.nix</code> shell would install 20 JupyterLab extensions and 4 kernels (C, Python, Go, and Ansible). The user would only need to edit <code>kernels</code>, <code>additionalExtensions</code>, and <code>buildInputs</code>; the rest would be automatic, and entering the shell would launch a JupyterLab instance.
<syntaxhighlight lang="nix">
{ pkgs ? import <nixpkgs> {}, pythonPackages ? pkgs.python36Packages }:
let kernels = [
      pythonPackages.ansible-kernel
      pythonPackages.jupyter-c-kernel
      pkgs.gophernotes
    ];
    additionalExtensions = [
      "@jupyterlab/toc"
      "@jupyterlab/fasta-extension"
      "@jupyterlab/geojson-extension"
      "@jupyterlab/katex-extension"
      "@jupyterlab/mathjax3-extension"
      "@jupyterlab/plotly-extension"
      "@jupyterlab/vega2-extension"
      "@jupyterlab/vega3-extension"
      "@jupyterlab/xkcd-extension"
      "jupyterlab-drawio"
      "@jupyterlab/hub-extension"
      "jupyterlab_bokeh"
    ];
in
pkgs.mkShell rec {
  buildInputs = [
    ### Base Packages
    pythonPackages.jupyterlab pkgs.nodejs
    ### Extensions
    pythonPackages.ipywidgets
    pythonPackages.ipydatawidgets
    pythonPackages.ipywebrtc
    pythonPackages.pythreejs
    pythonPackages.ipyvolume
    pythonPackages.jupyterlab-git
    pythonPackages.jupyterlab-latex
    pythonPackages.ipyleaflet
    pythonPackages.ipympl
  ] ++ kernels;
  shellHook = ''
    # The app directory in the Nix store is read-only, so copy it to a
    # writable temporary directory (mktemp -d already creates it).
    TEMPDIR=$(mktemp -d -p /tmp)
    cp -r ${pythonPackages.jupyterlab}/share/jupyter/lab/* "$TEMPDIR"
    chmod -R 755 "$TEMPDIR"
    echo "$TEMPDIR is the app directory"

    # kernels
    export JUPYTER_PATH="${pkgs.lib.concatMapStringsSep ":" (p: "${p}/share/jupyter/") kernels}"

    # labextensions: those declared by buildInputs plus additionalExtensions
    ${pkgs.lib.concatMapStrings
      (s: "jupyter labextension install --no-build --app-dir=$TEMPDIR ${s}; ")
      (pkgs.lib.unique
        ((pkgs.lib.concatMap
            (d: pkgs.lib.attrByPath ["passthru" "jupyterlabExtensions"] [] d)
            buildInputs) ++ additionalExtensions))}
    jupyter lab build --app-dir="$TEMPDIR"

    # start jupyterlab
    jupyter lab --app-dir="$TEMPDIR"
  '';
}
</syntaxhighlight>
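
The "automatic" part of the shell above is the expression that builds the list of lab extensions to install. The following is a minimal sketch of that step in isolation, not code from the pull request: the two entries in <code>inputs</code> are hypothetical stand-ins for real packages, and the extension name attached to the first one is only an assumed example of what a package might declare under <code>passthru.jupyterlabExtensions</code>.

<syntaxhighlight lang="nix">
# extensions.nix: a standalone sketch of how the extension list is assembled
let
  lib = (import <nixpkgs> {}).lib;

  # Hypothetical stand-ins for entries of buildInputs.
  inputs = [
    # A package that declares the lab extension it needs (assumed name).
    { name = "ipywidgets"; passthru.jupyterlabExtensions = [ "@jupyter-widgets/jupyterlab-manager" ]; }
    # A package without the attribute contributes nothing.
    { name = "nodejs"; }
  ];

  additionalExtensions = [
    "@jupyterlab/toc"
    "@jupyter-widgets/jupyterlab-manager" # listed twice on purpose
  ];
in
# Collect the declared extensions (defaulting to [] when the attribute is
# missing), append the user-supplied ones, and drop duplicates.
lib.unique
  (lib.concatMap
      (d: lib.attrByPath [ "passthru" "jupyterlabExtensions" ] [ ] d)
      inputs
    ++ additionalExtensions)
</syntaxhighlight>

Evaluating the file with <code>nix-instantiate --eval --strict extensions.nix</code> should print a list in which each extension name appears only once, which is the list the shell hook turns into <code>jupyter labextension install</code> commands.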


Some recent examples of work done on libraries:


* [https://github.com/NixOS/nixpkgs/pulls?utf8=%E2%9C%93&q=is%3Apr+nlp+ NLP]
* [https://github.com/NixOS/nixpkgs/pulls?utf8=%E2%9C%93&q=is%3Apr+sklearn scikit-learn]
* [https://github.com/NixOS/nixpkgs/pulls?utf8=%E2%9C%93&q=is%3Apr+tensorflow TensorFlow]


There has also been notable work on data science infrastructure:
* [https://github.com/NixOS/nixpkgs/pulls?utf8=%E2%9C%93&q=is%3Apr+jupyterhub JupyterHub]


with such highlights as [https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/development/jupyter/default.nix Jupyter kernels written in Nix]:


{{file|./modules/datasci.nix|nix|<nowiki>
== Channels ==


[https://matrix.to/#/#datascience:nixos.org #datascience:nixos.org on Matrix]


== People ==


[[User:Ixxie|Ixxie]]