====== AI / ML - Machine learning ======

  * https://jalammar.github.io/visualizing-pandas-pivoting-and-reshaping/
  * https://stanford-cs329s.github.io/syllabus.html

===== NLP =====

  * https://allennlp.org/
  * https://github.com/facebookresearch/PyText
  * https://github.com/NNLP-IL/Resources#named-entity-recognition-ner

===== AutoML =====

  * [[https://github.com/AxeldeRomblay/MLBox|MLBox]]
  * [[https://github.com/salesforce/TransmogrifAI|TransmogrifAI]]
  * [[http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html|h2o automl]]
  * [[https://github.com/jhfjhfj1/autokeras|autokeras]]
  * [[http://epistasislab.github.io/tpot/|tpot]]
  * https://www.automl.org/automl/

===== Serving Engines =====

  * [[https://predictionio.apache.org/community/projects/#demos|predictionio]]
  * [[https://github.com/combust/mleap/|MLeap]]
  * [[https://www.h2o.ai/products/h2o-sparkling-water/|h2o-sparkling-water]]

==== ML Serving engines / ML Scoring engine description ====

=== MLeap ===

  * The main motivation appears to be creating a uniform bundle.
  * Data: the data of the ML pipeline is serialized.
  * A Docker image is available for serving.
  * Disadvantage (maintenance): it does not warm up the model, which may affect performance.
  * Advantage: interaction is over REST.

===== formats =====

  * [[http://dmg.org/pfa/docs/exoplanets/|pfa]]
  * [[https://openscoring.io/|PMML]]
  * MLeap
  * Spark

===== speech recognition =====

  * https://cmusphinx.github.io/

==== Links ====

  * https://keras.io/
  * http://scikit-learn.org/stable/
  * [[https://dzone.com/articles/11-open-source-frameworks-for-ai-and-machine-learn?edition=371223|AI ML - top 11 open source fw]]
  * [[https://www.seldon.io|seldon]] - seldon core - Open source platform for deploying machine learning models on Kubernetes
  * [[http://maxpumperla.com/elephas/|elephas - Deep learning on Spark with Keras]]
  * [[https://github.com/databricks/spark-deep-learning|databricks/spark-deep-learning]]
  * [[https://polyaxon.com/|polyaxon]] - An open source platform for reproducible machine learning at scale.
  * [[https://www.kubeflow.org/|kubeflow]] - The Machine Learning Toolkit for Kubernetes
  * [[http://www.pachyderm.io/|pachyderm]] - Pachyderm lets you deploy and manage multi-stage, language-agnostic data pipelines while maintaining complete reproducibility and provenance.
  * [[https://chainer.org/|chainer]] - A Powerful, Flexible, and Intuitive Framework for Neural Networks
  * [[https://pytorch.org/|pytorch]] - An open source deep learning platform that provides a seamless path from research prototyping to production deployment.
  * [[https://developer.nvidia.com/tensorrt|NVIDIA TensorRT]] - NVIDIA TensorRT™ is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40x faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and finally deploy to hyperscale data centers, embedded, or automotive product platforms.
  * [[https://caffe2.ai/|caffe2]] - A New Lightweight, Modular, and Scalable Deep Learning Framework