Tim Blog

Himmel oder Hölle (Heaven or Hell)

linux

Process monitor: htop. Installation: sudo apt install htop

recommendation engine

Recommendation engine inside Postgres: MADlib; self-built engine. Set up the table structure: CREATE TABLE orders (id int, product_id int); CREATE TABLE products (id serial, name text); INSERT INTO orders ...
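One way a self-built engine on these two tables could work is a simple "ordered together" count. The sketch below is only an illustration, not the post's own method: it uses Python's built-in sqlite3 as a stand-in for Postgres (so `serial` becomes `integer PRIMARY KEY`), assumes that `orders.id` identifies an order with one row per product, and uses made-up sample rows. The function name `recommend` is hypothetical.

```python
import sqlite3


def recommend(conn, product_id, limit=3):
    """Return names of the products most often ordered together with product_id."""
    rows = conn.execute(
        """
        SELECT p.name, COUNT(*) AS times_together
        FROM orders AS a
        JOIN orders AS b
          ON a.id = b.id AND a.product_id <> b.product_id
        JOIN products AS p ON p.id = b.product_id
        WHERE a.product_id = ?
        GROUP BY p.id, p.name
        ORDER BY times_together DESC, p.name
        LIMIT ?
        """,
        (product_id, limit),
    ).fetchall()
    return [name for name, _count in rows]


# In-memory database standing in for Postgres (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id int, product_id int);
    CREATE TABLE products (id integer PRIMARY KEY, name text);
    """
)

# Made-up sample data: order 100 holds tea and milk, and so on.
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(1, "tea"), (2, "milk"), (3, "sugar"), (4, "coffee")],
)
conn.executemany(
    "INSERT INTO orders (id, product_id) VALUES (?, ?)",
    [(100, 1), (100, 2), (101, 1), (101, 2), (101, 3), (102, 4), (102, 2)],
)

print(recommend(conn, 1))
```

The self-join pairs every product in an order with every other product in that order, so the count reflects how many orders contain both. The same query should carry over to Postgres with `%s` placeholders in place of `?`, assuming a driver such as psycopg2.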

Fastai

Installation: !pip install fastai, or !conda install -c fastai -c pytorch fastai
Library: fastcore
fastscript: library for fast scripting
from fastscript import *
@call_parse
def main( msg:P...

Camelot

Camelot lets users tweak table extraction from PDFs. Extracted tables can be exported as pandas DataFrames or in other formats, including JSON, Excel, HTML, and SQLite. Installation and usage: !pip install "came...

sk-learn

Find a suitable algorithm. Classification: def create_baseline_classifiers(seed=8): """Create a list of baseline classifiers. Parameters ---------- seed: (optional) An integer to...

topic modeling

Methods: LDA. # initialise LDA model lda_model = LatentDirichletAllocation(n_components = 10, # number of topics random_state = 10, # random state ...

postgresql

A traditional RDBMS (relational database management system). Mainly used for relational data, it is object-relational in nature. Useful operations, like those on a DataFrame: specify your own custom functions...

tableau

Installation and usage. Set up Python environments: # connect Python environment with Tableau: pip install tabpy; # connect Jupyter environment with Tableau: pip install tabpy_client. Run TabPy: run fol...

dask

Dask is a parallel computing library that breaks larger computations down into smaller ones and distributes them through a task scheduler and task workers. It consists of three co...
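The split, schedule, and combine idea can be illustrated without Dask itself. The sketch below is not Dask's API: it uses `concurrent.futures` from the Python standard library as a stand-in, with the thread pool playing the role of scheduler plus workers. The helper names `split` and `parallel_sum_of_squares` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Break a list into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_sum_of_squares(chunk):
    """The small computation each worker runs on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Split the work, hand the chunks to worker threads, then combine."""
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(chunk_sum_of_squares, chunks))
    return sum(partials)


print(parallel_sum_of_squares(list(range(10))))
```

Dask applies the same pattern at a larger scale. It builds a graph of small tasks and lets its scheduler decide which worker runs each one, which is why it can handle data that does not fit in memory on a single machine.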

dataset source

Dataset search engine: https://datasetsearch.research.google.com
Data sources:
Awesome Data (data stored on GitHub): https://github.com/awesomedata/awesome-public-datasets
Data Is Plural: data ...