Python

Virtual environment

Creating and activating a Python virtual environment

virtualenv NAME
source NAME/bin/activate
which python

To deactivate

deactivate

Functional

Equivalent ways of filtering a list

list(filter(lambda n: n % 2 == 0, range(10)))
[n for n in range(10) if n % 2 == 0]

pip

Useful dependency stack:

pip install numpy pandas pillow opencv-python dominate torch pickle-mixin scipy

Install a pip dependency from GitHub

pip install -U git+https://github.com/pytorch/vision

Install a version matching a constraint

pip install "pillow<7"

Images

import cv2
img = cv2.imread("image.png")  # numpy.ndarray in BGR channel order; None if the file cannot be read

Useful

Progress bar for iterations

from tqdm import tqdm
# In Jupyter or Google Colab, use the notebook widget instead:
# from tqdm.notebook import tqdm

for i in tqdm(range(100)):
  pass

Spreading a dictionary (useful for passing arguments to functions or extending dictionaries/lists)

d = {'one': 1, 'two': 2}
{*d}   # {'one', 'two'} (a set of the keys)
{**d}  # {'one': 1, 'two': 2} (a shallow copy of the dictionary)
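
A minimal sketch of the two uses mentioned above: spreading a dictionary into keyword arguments and extending a dictionary or list. The function `describe` and the extra `'three'` key are hypothetical, made up for illustration.

```python
def describe(one, two, three=3):
    # Hypothetical function, used only to show keyword-argument spreading.
    return f"{one}-{two}-{three}"

d = {'one': 1, 'two': 2}

# **d passes each key/value pair as a keyword argument.
label = describe(**d)

# Spreading inside a new dict literal copies d and adds (or overrides) keys.
extended = {**d, 'three': 30}

# *d spreads only the keys, here into a new list.
keys = [*d, 'three']
```

Neither form modifies `d` itself; each builds a new object.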

Dictionaries can be built with comprehensions, analogously to lists

boolValues = {i: True for i in some_array}

Iterating over dictionary keys and values

for (key, val) in db.items():
  print(key, val)

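Equivalent ways of appending data to a list

res = []
for key, elem in hmap.items():
    res = [*res, (key, np.median(elem))]  # builds a new list on every iteration

res = []
for key, elem in hmap.items():
    res.append((key, np.median(elem)))  # appends in place; tuples keep key and median in order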

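Reading/writing a file

with open('old_request.txt') as file:  # default mode 'r' is read-only
    contents = file.read()

with open('old_request.txt', 'w') as file:  # 'w' overwrites; use 'a' to append
    file.write("text")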
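
The open modes are easy to mix up, so here is a small sketch that writes, appends, and reads back. It works in a temporary directory so that no real file is touched; the name `notes.txt` is a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")  # placeholder file name

    # "w" creates the file (or truncates an existing one) for writing.
    with open(path, "w") as file:
        file.write("first line\n")

    # "a" appends to the end instead of overwriting.
    with open(path, "a") as file:
        file.write("second line\n")

    # The default mode "r" is read-only.
    with open(path) as file:
        contents = file.read()
```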

matplotlib

Display an image from the matrix x using imshow or matshow

import matplotlib.pyplot as plt
plt.imshow(x)
plt.matshow(x)  # opens a new figure by default

Show a grid of images (2 x 4)

fig = plt.figure()
for i in range(8):  # 2 x 4 = 8 cells
  ax = plt.subplot(2, 4, i + 1)
  ax.imshow(imgs[i])
  ax.set_title(f"Sample {i}")
  ax.axis("off")
plt.tight_layout()
plt.show()

Useful options

fig, ax = plt.subplots(figsize=(10, 6))
ax.grid(True)
ax.legend(['AU', 'GB', 'US'])
ax.set_xlabel("Check Out hour")
ax.set_xticks(range(0, 24))

pandas

import pandas as pd
df = pd.read_csv("./file.csv")
df.head()
df.keys()

Generic

df = df.rename(columns={"id": "paper_sha", "paragraph": "text"})
df = df.rename_axis("_id")
df.dropna()                         # drops rows with any missing value; returns a new DataFrame
df.dropna(subset=['name', 'born'])  # only checks the listed columns
df.to_csv('csv2sql2.csv', index=True)

Take certain columns

cols = df.columns
df.loc[:, cols[4]:cols[10]]     # label slices include both endpoints
df.loc[:, cols[4]:]
df.loc[:, [cols[4], cols[-1]]]  # two specific columns, passed as a list

Delete column

del df['column_name']

Get unique values

metadata['source_x'].unique()

Filter data

df.loc[df['table'] == True]

Sum column entries

some_partial_sum = []
for i in df.keys():
    col_sum = df[i].sum()
    some_partial_sum.append(col_sum)

Reshaping data for ML models

import numpy as np
days = np.arange(len(dates)).reshape(-1, 1)  # column vector of shape (n, 1)
some_partial_sum = np.array(some_partial_sum).reshape(-1, 1)

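Find the total number of missing entries, and the number of rows with missing entries

df.isnull().sum().sum()        # total number of missing cells
df.isnull().any(axis=1).sum()  # number of rows with at least one missing cell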
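
Counting missing cells and counting rows that contain a missing cell give different numbers. A small sketch on made-up data (the column names and values are illustrative only):

```python
import numpy as np
import pandas as pd

# Made-up data: 3 missing cells spread over 2 rows.
df = pd.DataFrame({
    "name": ["Ann", None, "Cid"],
    "born": [1990.0, np.nan, np.nan],
})

# Sum per column, then sum across columns: total missing cells.
missing_cells = int(df.isnull().sum().sum())

# any(axis=1) flags each row that has at least one missing cell.
rows_with_missing = int(df.isnull().any(axis=1).sum())

# Per-column breakdown, as a plain dictionary.
missing_per_column = df.isnull().sum().to_dict()
```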

Constructing a list of date strings from a given start date

import datetime
start = '1/22/2020'
start_date = datetime.datetime.strptime(start, '%m/%d/%Y')
future_forecast_dates = []
for i in range(len(future_forecast)):
    future_forecast_dates.append((start_date + datetime.timedelta(days=i)).strftime('%m/%d/%Y'))
adjusted_dates = future_forecast_dates[:-5]  # drop the last 5 dates
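
The same loop can be written as a list comprehension. This is a sketch under one assumption: `n_days` is a made-up stand-in for the length of the forecast array, which is not defined in these notes.

```python
import datetime

start_date = datetime.datetime.strptime('1/22/2020', '%m/%d/%Y')
n_days = 12  # assumed stand-in for the length of the forecast array

# One formatted date string per day, starting at start_date.
date_strings = [
    (start_date + datetime.timedelta(days=i)).strftime('%m/%d/%Y')
    for i in range(n_days)
]

# Drop the last 5 dates, as in the loop version.
adjusted = date_strings[:-5]
```

Note that `strptime` accepts the unpadded month in `'1/22/2020'`, while `strftime` with `%m` and `%d` always writes zero-padded values such as `'01/22/2020'`.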

Useful in ML

Train-test split

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x_vals, y_vals, test_size=0.1, shuffle=False) 
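
With `shuffle=False` the split keeps the original order, so the test set is simply the tail of the data, which matters for time series. The sketch below imitates that behaviour with plain slicing; `ordered_split` is a hypothetical helper, not scikit-learn's implementation, and its rounding is an assumption.

```python
import math

def ordered_split(values, test_size=0.1):
    """Split without shuffling: the last share of the data becomes the test set."""
    n_test = math.ceil(len(values) * test_size)  # assumed rounding: round the test share up
    split_at = len(values) - n_test
    return values[:split_at], values[split_at:]

x_vals = list(range(10))  # made-up, already ordered data
X_train, X_test = ordered_split(x_vals, test_size=0.1)
```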

Iterating over files in a folder

import os
for f in os.listdir("folder"):  # also lists subdirectories, in arbitrary order
    path = os.path.join("folder", f)
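
`os.listdir` returns subdirectories as well as files, in no guaranteed order. A sketch that sorts the names and keeps only regular files; it runs in a temporary directory, and the file names are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create two placeholder files and one subdirectory to iterate over.
    for name in ("a.png", "b.png"):
        open(os.path.join(folder, name), "w").close()
    os.mkdir(os.path.join(folder, "nested"))

    # Sort for a reproducible order, then skip anything that is not a file.
    file_paths = []
    for f in sorted(os.listdir(folder)):
        path = os.path.join(folder, f)
        if os.path.isfile(path):
            file_paths.append(path)

file_names = [os.path.basename(p) for p in file_paths]
```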

One-hot encodings for N categories can be obtained as rows of the identity matrix

import numpy as np
encoding = np.eye(N)
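
To encode actual labels, the identity matrix can be indexed with an array of integer class labels. A minimal sketch, assuming the labels are integers in `0..N-1`; the values of `N` and `labels` are made up.

```python
import numpy as np

N = 4                            # number of categories (assumed)
labels = np.array([0, 2, 3, 2])  # made-up integer class labels

encoding = np.eye(N)
# Row i of the identity matrix is the one-hot vector for category i,
# so indexing with the label array picks one row per sample.
one_hot = encoding[labels]

# argmax along each row recovers the original labels.
decoded = one_hot.argmax(axis=1)
```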

Counting True elements in a list

[True, False, True, True].count(True)

Move the color channel of input_tensor from index 0 to index 2

np.einsum('ijk->jki', input_tensor)

Alternatively, the tensor can be “transposed” to move the color channel:

input_tensor.transpose((1, 2, 0))
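
A quick check that both forms give the same result, with `np.moveaxis` as a third option. The tensor shape (3 channels, height 4, width 5) is made up for illustration.

```python
import numpy as np

# Made-up tensor laid out as (channels, height, width).
input_tensor = np.arange(3 * 4 * 5).reshape(3, 4, 5)

via_einsum = np.einsum('ijk->jki', input_tensor)
via_transpose = input_tensor.transpose((1, 2, 0))
via_moveaxis = np.moveaxis(input_tensor, 0, -1)  # move axis 0 to the last position

# All three results have layout (height, width, channels).
same = (np.array_equal(via_einsum, via_transpose)
        and np.array_equal(via_einsum, via_moveaxis))
```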

Fun

Valentine's plot

import numpy as np
import matplotlib.pyplot as plt
import base64
 
fig = plt.figure()
ax = fig.gca()
 
t = np.linspace(0, 2 * np.pi, 100)
x = 16*np.sin(t)**3
y = 13*np.cos(t) - 5*np.cos(2*t) - 2*np.cos(3*t) - np.cos(4*t)
 
ax.plot(x, y)
plt.axis('off')
plt.text(min(x)/2, 0, base64.b64decode(b'SGFwcHkgVmFsZW50aW5lcyBicm8h').decode('utf-8'), color="red", fontsize=14)
plt.show()