Becoming a Python Expert
These guides explain the development and execution environment of Python. They also cover language fundamentals, good practices, build, deployment, and distribution.
- NOTE: Advanced Python programming topics are covered in a separate context.
Summary
How to install and set up Python
Preparing the Environment for Python
This topic describes how to set up the environment for Python development.
Fundamentals
- What is Python?
- Zen of Python
- Types
- Interpreter and Compiler
- Complete Documentation
- Main()
- Executing modules as scripts
- Command-line options
  - `-c command`
- Language limitations
- GIL
- Python Files
- .py
- .pyc
- .pyo
- .egg
- `__init__.py`
- `__main__.py`
- Requirements File
- Pipfile and Pipfile.lock
Best Practices
- Indentation and Length
- Line Break Before a Binary Operator
- Naming
- Encoding
- Strings `' '` and `" "`
- Comments `#`
- Imports
- Dunders to Documentation
- Annotation Functions
- Type Hints
- String Concatenation
- String Methods
- Exception
- Return
- Type Comparisons
- Methods with numerous parameters
- Docstrings
Awesome Python
Data Engineering
Artificial Intelligence
- Computer Vision
- Machine Learning
- Deep Learning
- Network Virtualization
- Text Processing
- Natural Language Processing
- ChatOps Tools
- Image Processing
- Search
- Robotics
- Science
- General-Purpose Machine Learning
- Data Analysis / Data Visualization
- Misc Scripts / iPython Notebooks / Codebases
- Neural Networks
- Kaggle Competition Source Code
- Reinforcement Learning
Databases
Security
Operation
- Date and Time
- Built-in Classes Enhancement
- Command-line Interface Development
- Command-line Tools
- Environment Management
- Files
- Networking
- Audio
- Hardware
- Video
DevOps
- Parsing
- Processes
- Concurrency and Parallelism
- Distributed Computing
- Compatibility
- Configuration
- Debugging Tools
- DevOps Tools
- Distribution
- Documentation
- Downloader
- Logging
- Job Scheduler
- Continuous Integration
Cloud
Python
- Algorithms and Design Patterns
- Code Analysis
- Code Quality
- Functional Programming
- Implementations
- Interactive Interpreter
- Package Management
- Package Repositories
- Testing
- Editor Plugins and IDEs
- Build Tools
Management Libraries
Web
- Admin Panels
- CMS
- Forms
- HTML Manipulation
- HTTP Clients
- News Feed
- Static Site Generator
- URL Manipulation
- Web Asset Management
- Web Content Extracting
- Web Crawling
- Web Frameworks
- WebSocket
- WSGI Servers
- Tagging
- Template Engine
Miscellaneous
Services
**Curso em Vídeo: exercise solutions**
- Learning Python in Portuguese. Class notes and solved exercises. Teacher: Gustavo Guanabara.
| World | Theme |
|---|---|
| 1 | Fundamentals |
| 2 | Control Structures |
| 3 | Compound Structures |
| 4 | Functions |
FAQ
- How do I configure my computer to run Python code?
- How do I configure my computer to develop in Python?
- What are the best practices to prepare an environment that runs Python?
- What is a requirements.txt file?
- How do I ensure a fully reproducible (100% equal) environment?
- How is the Python executable of a virtual environment able to use something different from the system site-packages?
- When should I use Go instead of Python?
How to install and set up Python
On Linux, make sure that the right version of Python is installed and that the basic developer toolset is available:
- Install the latest version of Python.
sudo apt install python3.8
- Satisfy some system requirements
sudo apt install build-essential \
libffi-dev \
python3-pip \
python3-dev \
python3-venv \
python3-setuptools \
python3-pkg-resources
- Create and activate a Python virtual environment
cd your-project
python3 -m venv venv
source venv/bin/activate
NOTE for beginners:
A Python virtual environment is a local interpreter that lets you install dependencies without polluting the global Python interpreter. There are different ways to create virtual environments (virtualenv; python3 -m venv) and to install packages (pip install; easy_install), which may be confusing at first.
- Install tools
sudo apt install git
- Vim editor (git's default editor)
sudo apt install vim
- Install the libraries used in this project
pip3 install --user od \
numpy \
pandas \
matplotlib \
virtualenv \
jupyter \
mysql-connector-python
Check Python Configuration
- Check the Python version
python --version
# Python 3.6.7
If it returns Python 2, try setting an alias in the .bashrc file
# Python
alias python=python3
- Check where Python is installed
which python
# /usr/bin/python
Preparing Environment
Environment Variables
- For an individual project: `PYTHONPATH` adds directories to the module search path.
- For the interpreter: `PYTHONHOME` changes the location of the standard libraries.
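As a rough illustration of how `PYTHONPATH` feeds the module search path, the sketch below starts a child interpreter with the variable set and reads back its `sys.path`. The helper name `child_sys_path` and the temporary directory are made up for this example.

```python
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir):
    """Return sys.path as seen by a child interpreter started with PYTHONPATH=extra_dir."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as project_dir:
        paths = child_sys_path(project_dir)
        # The directory passed through PYTHONPATH is part of the search path.
        print(project_dir in paths)
```

Running the check in a child process keeps the current interpreter's `sys.path` untouched.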
Configure Python PATH
- First, open the profile in an editor
vim ~/.profile
or
vim ~/.bashrc
- Insert the Python PATH
export PYTHONHOME=/usr/bin/python<NUMER_VERSION>
NOTE: to save and quit vim: ESC, :wq
- Reload the profile/bashrc
source ~/.bashrc
# or
. ~/.bashrc
Install multiple Python 3 versions
update-alternatives maintains the symbolic links that determine default commands.
- Run in a terminal
sudo update-alternatives --config python
If it returns the error "update-alternatives: error: no alternatives for python3", go to the next step.
- Install multiple Python versions
sudo update-alternatives --install /usr/bin/python python /usr/bin/python<NUMER_VERSION> 1
sudo update-alternatives --install /usr/bin/python python /usr/bin/python<OTHER_NUMER_VERSION> 2
- Change the Python version
sudo update-alternatives --config python
# or
sudo update-alternatives --set python /usr/bin/python3.6
- Check the change
python --version
# Python 3.8
Requirements File
A requirements file is a file containing a list of items to be installed using pip install.
- Generate the requirements.txt file
pip3 freeze > requirements.txt
or
venv/bin/pip3 freeze > requirements.txt
cat requirements.txt
- List the installed libraries
pip3 freeze
- Install the libraries listed in the requirements file
pip3 install -r requirements.txt
The -r option (--requirement) tells pip to install from the given requirements file.
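For illustration, a list in the same `name==version` shape that `pip3 freeze` prints can be built with the standard library (Python 3.8+). This is only a sketch: the function name `installed_requirements` is hypothetical, and the result is not guaranteed to match `pip3 freeze` line for line.

```python
from importlib import metadata


def installed_requirements():
    """Return sorted 'name==version' lines for the distributions this interpreter can see."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip installs with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(installed_requirements()))
```

Inside an activated virtual environment the list only contains what was installed there, which is what makes a generated requirements.txt project-specific.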
Virtual Environment
Python can be executed in a virtual environment that is semi-isolated from the system.
When Python starts, it analyzes the path of its binary. In a virtual environment, that binary is just a copy of, or a symbolic link to, your system's Python binary. Python then sets the sys.prefix location, which is used to locate site-packages (third-party libraries).
Symbolic link
- sys.prefix points to the virtual environment directory.
- sys.base_prefix points to the non-virtual environment.
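A minimal sketch of how the two prefixes can be compared at run time, assuming the environment was created with `venv` (or a recent `virtualenv`). The function name `in_virtual_environment` is only illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix keeps pointing at the installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("virtual env:", in_virtual_environment())
```

Outside a virtual environment both values are identical, so the function returns False.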
Example of how the files are kept in the virtual environment folder:
ll
# random.py -> /usr/lib/python3.6/random.py
# reprlib.py -> /usr/lib/python3.6/reprlib.py
# re.py -> /usr/lib/python3.6/re.py
# ...
tree
├── bin
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── easy_install
│   ├── easy_install-3.8
│   ├── pip
│   ├── pip3
│   ├── pip3.8
│   ├── python -> python3.8
│   ├── python3 -> python3.8
│   └── python3.8 -> /Library/Frameworks/Python.framework/Versions/3.8/bin/python3.8
├── include
├── lib
│   └── python3.8
│       └── site-packages
└── pyvenv.cfg
Create Virtual Environment
$ virtualenv -p python3 NAME_ENVIRONMENT
(env) $
or
$ python3 -m venv NAME_ENVIRONMENT
(env) $
To begin using the virtual environment, it needs to be activated.
Run the activate script
source <DIR>/bin/activate
Pipenv
- Package manager: Pipfile
- Virtual environment: $HOME/.local/share/virtualenvs
- Lock package: Pipfile.lock
Why use a Pipfile?
With pip and a requirements.txt file, the real issue is that the build isn’t deterministic: given the same input (the requirements.txt file), pip doesn’t always produce the same environment.
What is Pipenv?
It automatically creates and manages a virtualenv for your projects, and adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the ever-important Pipfile.lock, which is used to produce deterministic builds.
Features:
- Deterministic builds
- Keeps development and production libraries separated within a single file, the Pipfile
- Automatically adds/removes packages from your Pipfile
- Automatically creates and manages a virtualenv
- Checks PEP 508 requirements
- Checks installed package safety
Comparisons
# Pipfile
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
matplotlib = "==3.1.3"
[packages]
requests = "*"
numpy = "==1.18.1"
pandas = "==1.0.1"
wget = "==3.2"
[requires]
python_version = "3.8"
platform_system = 'Linux'
# requirements.txt
requests
matplotlib==3.1.3
numpy==1.18.1
pandas==1.0.1
wget==3.2
Install
pip3 install --user pipenv
Create Pipfile and virtual environment
pipenv --python 3
# Creating a virtualenv for this project…
# Pipfile: /home/campos/projects/becoming-a-expert-python/Pipfile
# Using /usr/bin/python3.8 (3.8.2) to create virtualenv…
# ⠼ Creating virtual environment...created virtual environment CPython3.8.2.final.0-64 in 256ms
# creator CPython3Posix(dest=/home/campos/.local/share/virtualenvs/becoming-a-expert-python-fmPL6zJP, clear=False, global=False)
# seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, via=copy, app_data_dir=/home/campos/.local/share/virtualenv/seed-app-data/v1)
# activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
# ✔ Successfully created virtual environment!
# Virtualenv location: /home/campos/.local/share/virtualenvs/becoming-a-expert-python-fmPL6zJP
# requirements.txt found, instead of Pipfile! Converting…
# ✔ Success!
- See where the virtual environment is installed
pipenv --venv
Activate environment
pipenv shell
Install Libraries with Pipfile
pipenv install flask
# or
pipenv install --dev flask
Create lock file
pipenv lock
# Locking [dev-packages] dependencies…
# Locking [packages] dependencies…
# ✔ Success!
References
- Official documentation
- Gerenciando suas dependências e ambientes python com pipenv
- How are Pipfile and Pipfile.lock used?
Understanding
Zen of Python
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
NOTE: PEP 20
Types
Examples:
# Converting real to integer
print('int(3.14) =', int(3.14))
# Converting integer to real
print('float(5) =', float(5))
# Calculation between integer and real results in real
print('5.0 / 2 + 3 =', 5.0 / 2 + 3)
# Integers in other base
print("int('20', 8) =", int('20', 8))  # base 8
print("int('20', 16) =", int('20', 16))  # base 16
# Operations with complex numbers
c = 3 + 4j
print('c =', c)
print('Real Part:', c.real)
print('Imaginary Part:', c.imag)
print('Conjugate:', c.conjugate())
# int(3.14) = 3
# float(5) = 5.0
# 5.0 / 2 + 3 = 5.5
# int('20', 8) = 16
# int('20', 16) = 32
# c = (3+4j)
# Real Part: 3.0
# Imaginary Part: 4.0
# Conjugate: (3-4j)
Interpreter and Compiler
CPython
Compiles source code to bytecode and interprets it; written in C.
Jython
Compiles source code to Java bytecode, which runs on the JVM; written in Java.
Comparison
Why use an alternative Python implementation?
CPython: makes it very easy to write C extensions for your Python code, because in the end the code is executed by an interpreter written in C.
Jython: on the other hand, makes it very easy to work with other Java programs: you can import any Java class with no additional effort, calling and using your Java classes from within your Jython programs.
How does a Python program run?
- First, the Python interpreter checks the syntax (sequentially).
- It then compiles the source to bytecode, which is loaded directly into system memory.
- Finally, the compiled bytecode is interpreted from memory to execute it.
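The steps above can be observed from within CPython itself. The sketch below uses the built-in `compile()` and the standard `dis` module; the source string and the `<example>` file name are arbitrary choices for this illustration.

```python
import dis

source = "total = 1 + 2\nprint(total)\n"

# Steps 1 and 2: parse the source (syntax check) and compile it to a code object.
code_object = compile(source, "<example>", "exec")

# The raw bytecode is stored as bytes on the code object.
print(type(code_object.co_code))

# Show a human-readable listing of the bytecode instructions.
dis.dis(code_object)

# Step 3: the interpreter executes the compiled bytecode.
namespace = {}
exec(code_object, namespace)
```

A syntax error is reported during the `compile()` call, before any line of the program runs.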
Programming Recommendations
"Readability counts"
Indentation and Length
- 4 spaces per indentation level
- Limit docstring lines to 72 characters
- Limit code lines to 79 characters
- Function statements and flow control, e.g.:
# Aligned with opening delimiter.
foo = long_function_name(var_one=0.0, var_two=0.0,
                         var_three=0.0, var_four=0.0)
Naming
- Class names (CapWords): CapWords
- Variables (snake_case): cat_words
- Constants: MAX_OVERFLOW
Line Break Before a Binary Operator
income = (gross_wages
          + taxable_interest
          + (dividends - qualified_dividends)
          - ira_deduction
          - student_loan_interest)
Encoding
By default: UTF-8
# -*- coding: UTF-8 -*-
<code>
Strings ' ' and " "
Strings written with single quotation marks and strings written with double quotation marks are the same.
Comments #
- The first word must be capitalized.
- Separate inline comments from the statement by at least two spaces.
x = x + 1  # Compensate for border
Imports
Use the following order:
- Standard library imports.
- Related third-party imports.
- Local application/library specific imports.
import argparse
import configparser
import os
import mysql.connector
import my_module
Yes:
import os
import sys
No:
import os, sys
No problems:
from subprocess import Popen, PIPE
Dunders to Documentation
__version__ = '0.1'
__author__ = 'Bruno Campos'
String Concatenation
- Use ''.join() to concatenate 3 or more strings:
''.join([stringA, stringB, stringC, stringD])
- Do not rely on the + operator for this; the in-place concatenation optimization is fragile even in CPython. Do not use:
stringA + stringB + stringC + stringD
String Methods
- Use string methods instead of the string module, because string methods are always much faster.
- Use ''.startswith() and ''.endswith() instead of string slicing to check for prefixes or suffixes.
Yes: if foo.startswith('bar'):
No: if foo[:3] == 'bar':
Exception
Limit the try clause to the minimum amount of code necessary.
Yes:
try:
    value = collection[key]
except KeyError:
    return key_not_found(key)
else:
    return handle_value(value)
No:
try:
    # Too broad!
    return handle_value(collection[key])
except KeyError:
    # Will also catch KeyError raised by handle_value()
    return key_not_found(key)
- The goal is to answer the question "What went wrong?" programmatically, rather than only stating that "A problem occurred".
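The fragments above are not runnable on their own, so here is a self-contained sketch of the recommended form. The bodies of `key_not_found` and `handle_value`, the wrapper `lookup`, and the sample data are assumptions made only for this example.

```python
def key_not_found(key):
    """Hypothetical fallback used when the key is missing."""
    return f"no entry for {key!r}"


def handle_value(value):
    """Hypothetical processing step; a KeyError raised here must not be swallowed."""
    return value.upper()


def lookup(collection, key):
    try:
        # Only the operation that may legitimately fail lives in the try block.
        value = collection[key]
    except KeyError:
        return key_not_found(key)
    else:
        # Runs only when no exception was raised above.
        return handle_value(value)


if __name__ == "__main__":
    capitals = {"br": "brasilia"}
    print(lookup(capitals, "br"))  # BRASILIA
    print(lookup(capitals, "xx"))  # no entry for 'xx'
```

Because `handle_value()` is called in the `else` branch, an unexpected `KeyError` raised inside it propagates instead of being reported as a missing key.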
Return
"Should explicitly state this as return None"
- Be consistent in return statements.
- Either all return statements in a function should return an expression, or none of them should.
Yes:
def foo(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return None
No:
def foo(x):
    if x >= 0:
        return math.sqrt(x)
Type Comparisons
- Always use isinstance()
Yes: if isinstance(obj, int):
No: if type(obj) is type(1):
Annotation Functions
"Don’t use comments to specify a type, when you can use type annotation."
- Acts as a very powerful linter (a code analyzer that reports errors).
- Python itself does not attach any meaning to these annotations.
- Examples:
Method arguments and return values
def func(a: int) -> List[int]:
def hello_name(name: str) -> str:
    return f'Hello {name}'
This states that the expected type of the name argument is str. Likewise, the expected return type is str.
Declare the type of a variable (type hints)
a = SomeFunc()  # type: SomeType
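To make concrete the point that Python does not give annotations any run-time meaning, this short sketch repeats a `hello_name` function like the one in the example above and calls it with a value of the "wrong" type.

```python
def hello_name(name: str) -> str:
    return f"Hello {name}"


# The annotations are stored on the function object...
print(hello_name.__annotations__)

# ...but they are not checked at run time: passing an int still works.
print(hello_name(42))  # Hello 42
```

Catching the mismatch is the job of an external static checker or an IDE, not of the interpreter.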
Type Hints
def send_email(address,     # type: Union[str, List[str]]
               sender,      # type: str
               cc,          # type: Optional[List[str]]
               bcc,         # type: Optional[List[str]]
               subject='',
               body=None    # type: List[str]
               ):
    """Send an email message. Return True if successful."""
    <code>
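The same signature can also be written with Python 3 annotation syntax instead of type comments. The sketch below is an assumption-laden rewrite: `body` is annotated as `Optional[List[str]]` because its default is `None`, and the function body is a placeholder that sends nothing.

```python
from typing import List, Optional, Union


def send_email(
    address: Union[str, List[str]],
    sender: str,
    cc: Optional[List[str]],
    bcc: Optional[List[str]],
    subject: str = "",
    body: Optional[List[str]] = None,
) -> bool:
    """Pretend to send an email. Return True if there is at least one recipient."""
    # Placeholder logic for this sketch: normalize the recipients only.
    recipients = [address] if isinstance(address, str) else list(address)
    return bool(recipients)
```

With annotations the types live in the signature itself, so they are available through `__annotations__` and to tools without parsing comments.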
References
- https://medium.com/@shamir.stav_83310/the-other-great-benefit-of-python-type-annotations-896c7d077c6b
- https://www.python.org/dev/peps/pep-0484/
- https://blog.jetbrains.com/pycharm/2015/11/python-3-5-type-hinting-in-pycharm-5/
Docstrings
- Docstrings must have:
- Args
- Returns
- Raises
Simple Example
def say_hello(name):
    """
    A simple function that says hello...
    Richie style
    """
    print(f"Hello {name}, is it me you're looking for?")
Example in Google style
def fetch_bigtable_rows(big_table, keys, other_silly_variable=None):
    """Fetches rows from a Bigtable.
    Retrieves rows pertaining to the given keys from the Table instance
    represented by big_table. Silly things may happen if
    other_silly_variable is not None.
    Args:
        big_table: An open Bigtable Table instance.
        keys: A sequence of strings representing the key of each table row
            to fetch.
        other_silly_variable: Another optional variable, that has a much
            longer name than the other args, and which does nothing.
    Returns:
        A dict mapping keys to the corresponding table row data
        fetched. Each row is represented as a tuple of strings. For
        example:
        {'Serak': ('Rigel VII', 'Preparer'),
         'Zim': ('Irk', 'Invader'),
         'Lrrr': ('Omicron Persei 8', 'Emperor')}
        If a key from the keys argument is missing from the dictionary,
        then that row was not found in the table.
    Raises:
        IOError: An error occurred accessing the bigtable.Table object.
    """
    return None
__doc__
Such a docstring becomes the __doc__ special attribute of that object.
- Simple Example
print(say_hello.__doc__)
# A simple function that says hello... Richie style
help()
- help() is a built-in function that prints out the object's docstring, in the manner of a man page.
>>> help(say_hello)
Help on function say_hello in module __main__:
# say_hello(name)
# A simple function that says hello... Richie style
Scripts with Docstrings
- Docstrings must show how to use the script
- They must document:
  - Usage: command-line syntax
  - Examples
  - Required and optional arguments
"""
Example of program with many options using docopt.

Usage:
  options_example.py [-hvqrf FILE PATH]
  my_program tcp <host> <port> [--timeout=<seconds>]

Examples:
  calculator_example.py 1 + 2 + 3 + 4 + 5
  calculator_example.py 1 + 2 '*' 3 / 4 - 5 # note quotes around '*'
  calculator_example.py sum 10 , 20 , 30 , 40

Arguments:
  FILE  input file
  PATH  out directory

Options:
  -h --help               show this help message and exit
  --version               show version and exit
  -v --verbose            print status messages
  -q --quiet              quiet mode
  -f --force
  -t, --timeout TIMEOUT   set timeout TIMEOUT seconds
  -a, --all               List everything.
"""
from docopt import docopt
if __name__ == '__main__':
    arguments = docopt(__doc__, version='1.0.0rc2')
    print(arguments)
Functions with Docstrings
A docstring for a function or method must summarize:
- behavior
- required arguments
- optional arguments
- default values of arguments
- return values
- raised exceptions
Example
def says(self, sound=None):
    """Prints what the animals name is and what sound it makes.
    If the argument `sound` isn't passed in, the default Animal
    sound is used.
    Parameters
    ----------
    sound : str, optional
        The sound the animal makes (default is None)
    Raises
    ------
    NotImplementedError
        If no sound is set for the animal or passed in as a parameter.
    """
    if self.sound is None and sound is None:
        raise NotImplementedError("Silent Animals are not supported!")
    out_sound = self.sound if sound is None else sound
    print(self.says_str.format(name=self.name, sound=out_sound))
Class with Docstrings
The docstring for a class should summarize its behavior and list the public methods and instance variables. If the class is intended to be subclassed and has an additional interface for subclasses, this interface should be listed separately (in the docstring). The class constructor should be documented in the docstring for its __init__ method. Individual methods should be documented by their own docstrings.
Example
class SimpleClass:
    """Class docstrings go here."""
    def say_hello(self, name: str):
        """Class method docstrings go here."""
        print(f'Hello {name}')
Class docstrings should contain the following information:
- A brief summary of its purpose and behavior
- Any public methods, along with a brief description
- Any class properties (attributes)
- Anything related to the interface for subclassers, if the class is intended to be subclassed
References
- PEP 8
- PEP 484
- PEP 257
- https://realpython.com/python-pep8/#naming-conventions
- https://pep8.org
- Style guide Google: https://github.com/google/styleguide/blob/gh-pages/pyguide.md#38-comments-and-docstrings
Methods with numerous parameters
Methods with numerous parameters are a challenge to maintain, especially if most of them share the same datatype.
These situations usually denote the need for new objects to wrap the
numerous parameters.
Example(s):
- too many arguments
def add_person(birthYear: int, birthMonth: int, birthDate: int,
               height: int, weight: int,
               ssn: int):
    '''too many arguments'''
    ...
- preferred approach
def add_person(birthdate: 'Date',
               measurements: 'BodyMeasurements',
               ssn: int):
    '''preferred approach'''
    ...
Cyclomatic Complexity
Cyclomatic complexity counts the number of decision points in a method.
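As a rough sketch of the idea, decision points can be counted by walking the syntax tree with the standard `ast` module. This is a simplified approximation written for this guide (the function name and the choice of node types are assumptions); dedicated tools may count some constructs differently.

```python
import ast

# Node types treated as decision points in this simplified count.
DECISION_NODES = (
    ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler, ast.comprehension,
)


def cyclomatic_complexity(source):
    """Approximate complexity: 1 + number of decision points found in the source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra branches.
            complexity += len(node.values) - 1
    return complexity


example = '''
def classify(n):
    if n < 0:
        return "negative"
    for digit in str(n):
        if digit == "0" and n > 9:
            return "has zero"
    return "plain"
'''

print(cyclomatic_complexity(example))  # 5
```

The example has two `if` statements, one `for` loop and one `and`, which gives four decision points plus the single entry path.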
Basic Commands
- Libraries
- The print function
- Data types
- Numeric systems
- Math libraries
Control Structure
- Conditional
- Repetition
- Functional Programming
Simple Data Structure
- Tuples
- List
- Dict
Functions
- Defining Functions
- Documentation
- Default arguments
- Packing and unpacking arguments
- Variable Scope
- Global variable
- Constants
- Recursive functions
- Lambda Expressions
Are global variables evil?
Global variables are bad in any programming language.
However, global constants are not conceptually the same as global variables;
global constants are perfectly fine to use,
so when you need a constant, using a global is acceptable.
- http://wiki.c2.com/?GlobalVariablesAreBad
To make code more modular, the first step is always to move all global variables into a "config" object.
Violating Pure Function definition
I believe that clean and (nearly) bug-free code should have functions that are as pure as possible (see pure functions). A pure function is one that satisfies the following conditions:
- The function always evaluates to the same result value given the same argument value(s). The result cannot depend on any hidden information or state that may change while the program executes or between different executions of the program, nor can it (usually) depend on any external input from I/O devices.
- Evaluating the result does not cause any semantically observable side effect, such as mutating mutable objects or writing output to I/O devices.
Having global variables violates at least one of the conditions above, if not both, since external code can probably cause unexpected results.
Another clear definition of pure functions: "A pure function is a function that takes all of its inputs as explicit arguments and produces all of its outputs as explicit results." [1] Having global variables violates the idea of pure functions, since an input and perhaps one of the outputs (the global variable) is not explicitly given or returned.
Violating Unit testing F.I.R.S.T principle
Furthermore, if you consider unit testing and the F.I.R.S.T principle (Fast, Independent, Repeatable, Self-Validating and Timely tests), code that relies on global variables will probably violate the Independent principle (which means that tests don't depend on each other).
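A small sketch of the difference, using made-up function names: the impure version reads and mutates a global, so its result depends on how many times it was called before, while the pure version receives everything it needs as an argument.

```python
counter = 0  # global mutable state


def next_id_impure():
    """Impure: the result depends on, and changes, the global counter."""
    global counter
    counter += 1
    return counter


def next_id_pure(current):
    """Pure: the input and the output are both explicit."""
    return current + 1


if __name__ == "__main__":
    # Two identical calls give different results, so a test relying on the
    # first value breaks if another test called the function before it.
    print(next_id_impure(), next_id_impure())

    # Identical calls always give the same result, so tests stay independent.
    print(next_id_pure(0), next_id_pure(0))
```

Tests for `next_id_pure` can run in any order, which is what the Independent part of F.I.R.S.T asks for.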
Configuration File
There are ways to manage the configuration:
- Using built-in data structure
- Using external configuration file
- json
- ini
- Using environment variables
- Using dynamic loading
Using built-in data structure
Use a dictionary, e.g.:
DATABASE_CONFIG = {
    'host': 'localhost',
    'dbname': 'company',
    'user': 'user',
    'password': 'password',
    'port': 3306
}
It should live in a separate file, for example `config.py`.
Using environment variables
The configuration values are not managed in a separate file; they are read from the environment of the running process.
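A minimal sketch of reading such settings from the environment with `os.environ`. The variable names (`DB_HOST`, `DB_NAME`, `DB_USER`, `DB_PASSWORD`, `DB_PORT`) and the fallback values are assumptions chosen for this example, not a convention of any particular library.

```python
import os


def load_database_config(environ=os.environ):
    """Build the database settings from environment variables, with fallbacks."""
    return {
        "host": environ.get("DB_HOST", "localhost"),
        "dbname": environ.get("DB_NAME", "company"),
        "user": environ.get("DB_USER", "user"),
        "password": environ.get("DB_PASSWORD", ""),
        # Environment variables are always strings, so convert the port.
        "port": int(environ.get("DB_PORT", "3306")),
    }


if __name__ == "__main__":
    print(load_database_config({"DB_HOST": "db.internal", "DB_PORT": "3307"}))
```

Passing the mapping as a parameter keeps the function easy to test without touching the real process environment.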
Control Flow
- examples
- range
- Looping Techniques
- items()
- enumerate()
- zip()
items()
- dictionaries
knights = {'gallahad': 'the pure',
           'robin': 'the brave'}
for k, v in knights.items():
    print(k, v)
# gallahad the pure
# robin the brave
enumerate()
- List
for i, v in enumerate(['tic', 'tac', 'toe']):
    print(i, v)
# 0 tic
# 1 tac
# 2 toe
zip()
- Loop over two or more sequences at the same time
- An excellent tool for keeping algorithmic complexity low, since the sequences are traversed together in a single loop
questions = ['name', 'quest', 'favorite color']
answers = ['lancelot', 'the holy grail', 'blue']
for q, a in zip(questions, answers):
    print('What is your {0}? It is {1}.'.format(q, a))
# What is your name? It is lancelot.
# What is your quest? It is the holy grail.
# What is your favorite color? It is blue.
Functions
TODO:
- examples
- Optional arguments
- Unpacking Argument (**kwargs)
Optional arguments
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
Accepts one required argument (voltage) and three optional arguments (state, action, and type)
Unpacking Argument
def parrot(voltage, state='a stiff', action='voom'):
    print("-- This parrot wouldn't", action, end=' ')
    print("if you put", voltage, "volts through it.", end=' ')
    print("E's", state, "!")
d = {"voltage": "four million",
     "state": "bleedin' demised",
     "action": "VOOM"}
parrot(**d)
# -- This parrot wouldn't VOOM if you put four million volts through it. E's bleedin' demised !
Files
- The standard output file can be referenced as sys.stdout
Serialization
- Pickle
- sqlite
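Both options listed above ship with the standard library. The sketch below shows a round trip with each one; the sample record and the `knights` table are invented for this example, and the SQLite database lives only in memory.

```python
import pickle
import sqlite3

record = {"name": "robin", "title": "the brave", "quests": [1, 2, 3]}

# pickle: turn a Python object into bytes and back again.
payload = pickle.dumps(record)
restored = pickle.loads(payload)
print(restored == record)  # True

# sqlite3: store rows in a relational database (in memory for this sketch).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE knights (name TEXT, title TEXT)")
connection.execute(
    "INSERT INTO knights (name, title) VALUES (?, ?)",
    (record["name"], record["title"]),
)
rows = connection.execute("SELECT name, title FROM knights").fetchall()
print(rows)  # [('robin', 'the brave')]
connection.close()
```

Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.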
Classes
- A namespace is a mapping from names to objects; assignments only bind names to objects.
- Use verbs for methods and nouns for data attributes.
- Nothing in Python makes it possible to enforce data hiding.
Examples:
class MyClass:
    """A simple example class"""
    i = 12345
    def f(self):
        return 'hello world'
self
- The first argument of a method is called self. This is nothing more than a convention.
- It helps the readability of methods: there is no chance of confusing local variables with instance variables when glancing through a method.
class Bag:
    def __init__(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)
__class__
- Each value is an object. Its class is stored as object.__class__
class MyFirstClass:
    """A simple example class"""
    i = 42
    def func_ex(self):
        print('learning Python')
if __name__ == '__main__':
    obj = MyFirstClass()  # initialized instance
    obj.func_ex()
    print(obj.__class__)
# learning Python
# <class '__main__.MyFirstClass'>
Awesome Python by Category
A curated list of awesome Python frameworks, libraries, software and resources.
Based on: https://awesome-python.com/
Admin Panels
Libraries for administrative interfaces.
- ajenti - The admin panel your servers deserve.
- django-grappelli - A jazzy skin for the Django Admin-Interface.
- django-jet - Modern responsive template for the Django admin interface with improved functionality.
- django-suit - Alternative Django Admin-Interface (free only for Non-commercial use).
- django-xadmin - Drop-in replacement of Django admin comes with lots of goodies.
- jet-bridge - Admin panel framework for any application with nice UI (ex Jet Django)
- flask-admin - Simple and extensible administrative interface framework for Flask.
- flower - Real-time monitor and web admin for Celery.
- wooey - A Django app which creates automatic web UIs for Python scripts.
Algorithms and Design Patterns
Python implementation of algorithms and design patterns.
- algorithms - Minimal examples of data structures and algorithms in Python.
- PyPattyrn - A simple yet effective library for implementing common design patterns.
- python-patterns - A collection of design patterns in Python.
- sortedcontainers - Fast, pure-Python implementation of SortedList, SortedDict, and SortedSet types.
Audio
Libraries for manipulating audio and its metadata.
- Audio
- audioread - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
- dejavu - Audio fingerprinting and recognition.
- mingus - An advanced music theory and notation package with MIDI file and playback support.
- pyAudioAnalysis - Audio feature extraction, classification, segmentation and applications.
- pydub - Manipulate audio with a simple and easy high level interface.
- TimeSide - Open web audio processing framework.
- Metadata
- beets - A music library manager and MusicBrainz tagger.
- eyeD3 - A tool for working with audio files, specifically MP3 files containing ID3 metadata.
- mutagen - A Python module to handle audio metadata.
- tinytag - A library for reading music meta data of MP3, OGG, FLAC and Wave files.
Authentication
Libraries for implementing authentications schemes.
- OAuth
- authlib - JavaScript Object Signing and Encryption draft implementation.
- django-allauth - Authentication app for Django that "just works."
- django-oauth-toolkit - OAuth 2 goodies for Django.
- oauthlib - A generic and thorough implementation of the OAuth request-signing logic.
- python-oauth2 - A fully tested, abstract interface to creating OAuth clients and servers.
- python-social-auth - An easy-to-setup social authentication mechanism.
- JWT
- pyjwt - JSON Web Token implementation in Python.
- python-jose - A JOSE implementation in Python.
- python-jwt - A module for generating and verifying JSON Web Tokens.
Build Tools
Compile software from source code.
- buildout - A build system for creating, assembling and deploying applications from multiple parts.
- PlatformIO - A console tool to build code with different development platforms.
- pybuilder - A continuous build tool written in pure Python.
- SCons - A software construction tool.
Built-in Classes Enhancement
Libraries for enhancing Python built-in classes.
- dataclasses - (Python standard library) Data classes.
- attrs - Replacement for `__init__`, `__eq__`, `__repr__`, etc. boilerplate in class definitions.
- bidict - Efficient, Pythonic bidirectional map data structures and related functionality.
- Box - Python dictionaries with advanced dot notation access.
- DottedDict - A library that provides a method of accessing lists and dicts with a dotted path notation.
CMS
Content Management Systems.
- wagtail - A Django content management system.
- django-cms - An Open source enterprise CMS based on the Django.
- feincms - One of the most advanced Content Management Systems built on Django.
- Kotti - A high-level, Pythonic web application framework built on Pyramid.
- mezzanine - A powerful, consistent, and flexible content management platform.
- plone - A CMS built on top of the open source application server Zope.
- quokka - Flexible, extensible, small CMS powered by Flask and MongoDB.
Caching
Libraries for caching data.
- beaker - A WSGI middleware for sessions and caching.
- django-cache-machine - Automatic caching and invalidation for Django models.
- django-cacheops - A slick ORM cache with automatic granular event-driven invalidation.
- dogpile.cache - dogpile.cache is next generation replacement for Beaker made by same authors.
- HermesCache - Python caching library with tag-based invalidation and dogpile effect prevention.
- pylibmc - A Python wrapper around the libmemcached interface.
- python-diskcache - SQLite and file backed cache backend with faster lookups than memcached and redis.
ChatOps Tools
Libraries for chatbot development.
- errbot - The easiest and most popular chatbot to implement ChatOps.
Code Analysis
Tools of static analysis, linters and code quality checkers. Also see awesome-static-analysis.
- Code Analysis
- coala - Language independent and easily extendable code analysis application.
- code2flow - Turn your Python and JavaScript code into DOT flowcharts.
- prospector - A tool to analyse Python code.
- pycallgraph - A library that visualises the flow (call graph) of your Python application.
- Code Linters
- flake8 - A wrapper around pycodestyle, pyflakes and McCabe.
- pylint - A fully customizable source code analyzer.
- pylama - A code audit tool for Python and JavaScript.
- wemake-python-styleguide - The strictest and most opinionated Python linter ever.
- Code Formatters
- Static Type Checkers, also see awesome-python-typing
- mypy - Checks variable types statically, before the code runs.
- pyre-check - Performant type checking.
- Static Type Annotations Generators
- MonkeyType - A system for Python that generates static type annotations by collecting runtime types.
Command-line Interface Development
Libraries for building command-line applications.
- Command-line Application Development
- cement - CLI Application Framework for Python.
- click - A package for creating beautiful command line interfaces in a composable way.
- cliff - A framework for creating command-line programs with multi-level commands.
- clint - Python Command-line Application Tools.
- docopt - Pythonic command line arguments parser.
- python-fire - A library for creating command line interfaces from absolutely any Python object.
- python-prompt-toolkit - A library for building powerful interactive command lines.
- Terminal Rendering
- asciimatics - A package to create full-screen text UIs (from interactive forms to ASCII animations).
- bashplotlib - Making basic plots in the terminal.
- colorama - Cross-platform colored terminal text.
- tqdm - Fast, extensible progress bar for loops and CLI.
Command-line Tools
Useful CLI-based tools for productivity.
- Productivity Tools
- cookiecutter - A command-line utility that creates projects from cookiecutters (project templates).
- doitlive - A tool for live presentations in the terminal.
- howdoi - Instant coding answers via the command line.
- PathPicker - Select files out of bash output.
- percol - Adds flavor of interactive selection to the traditional pipe concept on UNIX.
- thefuck - Correcting your previous console command.
- tmuxp - A tmux session manager.
- try - A dead simple CLI to try out Python packages.
- CLI Enhancements
- httpie - A command line HTTP client, a user-friendly cURL replacement.
- kube-shell - An integrated shell for working with the Kubernetes CLI.
- mycli - A Terminal Client for MySQL with AutoCompletion and Syntax Highlighting.
- pgcli - Postgres CLI with autocompletion and syntax highlighting.
- saws - A Supercharged aws-cli.
Compatibility
Libraries for migrating from Python 2 to 3.
- python-future - The missing compatibility layer between Python 2 and Python 3.
- python-modernize - Modernizes Python code for eventual Python 3 migration.
- six - Python 2 and 3 compatibility utilities.
Computer Vision
Libraries for computer vision.
- OpenCV - Open Source Computer Vision Library.
- pytesseract - Another wrapper for Google Tesseract OCR.
- SimpleCV - An open source framework for building computer vision applications.
Concurrency and Parallelism
Libraries for concurrent and parallel execution. Also see awesome-asyncio.
- concurrent.futures - (Python standard library) A high-level interface for asynchronously executing callables.
- multiprocessing - (Python standard library) Process-based parallelism.
- eventlet - Asynchronous framework with WSGI support.
- gevent - A coroutine-based Python networking library that uses greenlet.
- uvloop - Ultra fast implementation of the asyncio event loop on top of libuv.
- scoop - Scalable Concurrent Operations in Python.
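As a rough illustration of the concurrent.futures entry above, the following minimal sketch runs a function over several inputs in a thread pool. The names square and run_in_pool are hypothetical helpers made up for this example, and square merely stands in for a slower I/O-bound task.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Hypothetical stand-in for a slow, I/O-bound task."""
    return n * n


def run_in_pool(values, max_workers=4):
    # executor.map returns results in input order, even though
    # the individual calls may run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(run_in_pool([1, 2, 3, 4]))
```

For CPU-bound work, swapping in ProcessPoolExecutor from the same module is usually the first thing to try, since threads in CPython share the GIL.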
Configuration
Libraries for storing and parsing configuration options.
- configobj - INI file parser with validation.
- configparser - (Python standard library) INI file parser.
- profig - Config from multiple formats with value conversion.
- python-decouple - Strict separation of settings from code.
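To give a feel for the standard library option in this list, here is a small sketch that parses an INI string with configparser. The section name, keys and the load_settings helper are assumptions invented for the example, not part of any real project.

```python
import configparser

# Hypothetical configuration text; a real project would usually read a file.
SAMPLE_INI = """
[server]
host = localhost
port = 8080
debug = yes
"""


def load_settings(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["server"]
    # getint/getboolean convert the raw strings to typed values.
    return {
        "host": section.get("host"),
        "port": section.getint("port"),
        "debug": section.getboolean("debug"),
    }


if __name__ == "__main__":
    print(load_settings(SAMPLE_INI))
```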
Cryptography
- cryptography - A package designed to expose cryptographic primitives and recipes to Python developers.
- paramiko - The leading native Python SSHv2 protocol library.
- passlib - Secure password storage/hashing library, very high level.
- pynacl - Python binding to the Networking and Cryptography (NaCl) library.
Data Analysis
Libraries for data analysis.
- Blaze - NumPy and Pandas interface to Big Data.
- Open Mining - Business Intelligence (BI) in Pandas interface.
- Orange - Data mining, data visualization, analysis and machine learning through visual programming or scripts.
- Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
- Optimus - Agile Data Science Workflows made easy with PySpark.
Data Validation
Libraries for validating data. Used for forms in many cases.
- Cerberus - A lightweight and extensible data validation library.
- colander - Validating and deserializing data obtained via XML, JSON, or an HTML form post.
- jsonschema - An implementation of JSON Schema for Python.
- schema - A library for validating Python data structures.
- Schematics - Data Structure Validation.
- valideer - Lightweight extensible data validation and adaptation library.
- voluptuous - A Python data validation library.
Data Visualization
Libraries for visualizing data. Also see awesome-javascript.
- Altair - Declarative statistical visualization library for Python.
- Bokeh - Interactive Web Plotting for Python.
- bqplot - Interactive Plotting Library for the Jupyter Notebook.
- Dash - Built on top of Flask, React and Plotly aimed at analytical web applications.
- plotnine - A grammar of graphics for Python based on ggplot2.
- Matplotlib - A Python 2D plotting library.
- Pygal - A Python SVG Charts Creator.
- PyGraphviz - Python interface to Graphviz.
- PyQtGraph - Interactive and realtime 2D/3D/Image plotting and science/engineering widgets.
- Seaborn - Statistical data visualization using Matplotlib.
- VisPy - High-performance scientific visualization based on OpenGL.
Database
Databases implemented in Python.
- pickleDB - A simple and lightweight key-value store for Python.
- tinydb - A tiny, document-oriented database.
- ZODB - A native object database for Python. A key-value and object graph database.
Database Drivers
Libraries for connecting and operating databases.
- MySQL - awesome-mysql
- mysqlclient - MySQL connector with Python 3 support (mysql-python fork).
- PyMySQL - A pure Python MySQL driver compatible to mysql-python.
- PostgreSQL - awesome-postgres
- Other Relational Databases
- pymssql - A simple database interface to Microsoft SQL Server.
- SuperSQLite - A supercharged SQLite library built on top of apsw.
- NoSQL Databases
- cassandra-driver - The Python Driver for Apache Cassandra.
- happybase - A developer-friendly library for Apache HBase.
- kafka-python - The Python client for Apache Kafka.
- py2neo - A client library and toolkit for working with Neo4j.
- pymongo - The official Python client for MongoDB.
- redis-py - The Python client for Redis.
- Asynchronous Clients
- motor - The async Python driver for MongoDB.
Date and Time
Libraries for working with dates and times.
- Chronyk - A Python 3 library for parsing human-written times and dates.
- dateutil - Extensions to the standard Python datetime module.
- delorean - A library for clearing up the inconvenient truths that arise dealing with datetimes.
- moment - A Python library for dealing with dates/times. Inspired by Moment.js.
- Pendulum - Python datetimes made easy.
- PyTime - An easy-to-use Python module which aims to operate date/time/datetime by string.
- pytz - World timezone definitions, modern and historical. Brings the tz database into Python.
- when.py - Providing user-friendly functions to help perform common date and time actions.
- maya - Datetimes for Humans.
Debugging Tools
Libraries for debugging code.
- pdb-like Debugger
- Tracing
- lptrace - strace for Python programs.
- manhole - Debugging over UNIX socket connections; presents the stack traces for all threads and an interactive prompt.
- pyringe - Debugger capable of attaching to and injecting code into Python processes.
- python-hunter - A flexible code tracing toolkit.
- Profiler
- line_profiler - Line-by-line profiling.
- memory_profiler - Monitor Memory usage of Python code.
- profiling - An interactive Python profiler.
- py-spy - A sampling profiler for Python programs. Written in Rust.
- pyflame - A ptracing profiler For Python.
- vprof - Visual Python profiler.
- Others
- icecream - Inspect variables, expressions, and program execution with a single, simple function call.
- django-debug-toolbar - Display various debug information for Django.
- django-devserver - A drop-in replacement for Django's runserver.
- flask-debugtoolbar - A port of the django-debug-toolbar to flask.
- pyelftools - Parsing and analyzing ELF files and DWARF debugging information.
Deep Learning
Frameworks for Neural Networks and Deep Learning. Also see awesome-deep-learning.
- caffe - A fast open framework for deep learning.
- keras - A high-level neural networks library capable of running on top of either TensorFlow or Theano.
- mxnet - A deep learning framework designed for both efficiency and flexibility.
- pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration.
- SerpentAI - Game agent framework. Use any video game as a deep learning sandbox.
- tensorflow - The most popular Deep Learning framework created by Google.
- Theano - A library for fast numerical computation.
DevOps Tools
Software and libraries for DevOps.
- ansible - A radically simple IT automation platform.
- cloudinit - A multi-distribution package that handles early initialization of a cloud instance.
- cuisine - Chef-like functionality for Fabric.
- docker-compose - Fast, isolated development environments using Docker.
- fabric - A simple, Pythonic tool for remote execution and deployment.
- fabtools - Tools for writing awesome Fabric files.
- honcho - A Python clone of Foreman, for managing Procfile-based applications.
- OpenStack - Open source software for building private and public clouds.
- pexpect - Controlling interactive programs in a pseudo-terminal like GNU expect.
- psutil - A cross-platform process and system utilities module.
- saltstack - Infrastructure automation and management system.
- supervisor - Supervisor process control system for UNIX.
Distributed Computing
Frameworks and libraries for Distributed Computing.
- Batch Processing
- PySpark - Apache Spark Python API.
- dask - A flexible parallel computing library for analytic computing.
- luigi - A module that helps you build complex pipelines of batch jobs.
- mrjob - Run MapReduce jobs on Hadoop or Amazon Web Services.
- Ray - A system for parallel and distributed Python that unifies the machine learning ecosystem.
- Stream Processing
- faust - A stream processing library, porting the ideas from Kafka Streams to Python.
- streamparse - Run Python code against real-time streams of data via Apache Storm.
Distribution
Libraries to create packaged executables for release distribution.
- dh-virtualenv - Build and distribute a virtualenv as a Debian package.
- Nuitka - Compile scripts, modules, packages to an executable or extension module.
- py2app - Freezes Python scripts (Mac OS X).
- py2exe - Freezes Python scripts (Windows).
- PyInstaller - Converts Python programs into stand-alone executables (cross-platform).
- pynsist - A tool to build Windows installers; the installers bundle Python itself.
Documentation
Libraries for generating project documentation.
- sphinx - Python Documentation generator.
- pdoc - Epydoc replacement to auto generate API documentation for Python libraries.
- pycco - The literate-programming-style documentation generator.
Downloader
Libraries for downloading.
- s3cmd - A command line tool for managing Amazon S3 and CloudFront.
- s4cmd - Super S3 command line tool, good for higher performance.
- you-get - A YouTube/Youku/Niconico video downloader written in Python 3.
- youtube-dl - A small command-line program to download videos from YouTube.
E-commerce
Frameworks and libraries for e-commerce and payments.
- alipay - Unofficial Alipay API for Python.
- Cartridge - A shopping cart app built using Mezzanine.
- django-oscar - An open-source e-commerce framework for Django.
- django-shop - A Django based shop system.
- merchant - A Django app to accept payments from various payment processors.
- money - Money class with optional CLDR-backed locale-aware formatting and an extensible currency exchange.
- python-currencies - Display money format and its filthy currencies.
- forex-python - Foreign exchange rates, Bitcoin price index and currency conversion.
- saleor - An e-commerce storefront for Django.
- shoop - An open source E-Commerce platform based on Django.
Editor Plugins and IDEs
- Emacs
- elpy - Emacs Python Development Environment.
- Sublime Text
- anaconda - Anaconda turns your Sublime Text 3 into a full-featured Python development IDE.
- SublimeJEDI - A Sublime Text plugin to the awesome auto-complete library Jedi.
- Vim
- jedi-vim - Vim bindings for the Jedi auto-completion library for Python.
- python-mode - An all in one plugin for turning Vim into a Python IDE.
- YouCompleteMe - Includes Jedi-based completion engine for Python.
- Visual Studio
- PTVS - Python Tools for Visual Studio.
- Visual Studio Code
- Python - The official VSCode extension with rich support for Python.
- IDE
Email
Libraries for sending and parsing email.
- envelopes - Mailing for human beings.
- flanker - An email address and Mime parsing library.
- imbox - Python IMAP for Humans.
- inbox.py - Python SMTP Server for Humans.
- lamson - Pythonic SMTP Application Server.
- Marrow Mailer - High-performance extensible mail delivery framework.
- modoboa - A mail hosting and management platform including a modern and simplified Web UI.
- Nylas Sync Engine - Providing a RESTful API on top of a powerful email sync platform.
- yagmail - Yet another Gmail/SMTP client.
Environment Management
Libraries for Python version and virtual environment management.
- pyenv - Simple Python version management.
- pipenv - Python Development Workflow for Humans.
- poetry - Python dependency management and packaging made easy.
- virtualenv - A tool to create isolated Python environments.
Files
Libraries for file manipulation and MIME type detection.
- mimetypes - (Python standard library) Map filenames to MIME types.
- path.py - A module wrapper for os.path.
- pathlib - (Python standard library) A cross-platform, object-oriented path library.
- PyFilesystem2 - Python's filesystem abstraction layer.
- python-magic - A Python interface to the libmagic file type identification library.
- Unipath - An object-oriented approach to file/directory operations.
- watchdog - API and shell utilities to monitor file system events.
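The two standard library entries above, pathlib and mimetypes, combine naturally. The sketch below is only an illustration: describe is a hypothetical helper and the file name is made up. It uses PurePosixPath so that no file needs to exist on disk.

```python
import mimetypes
from pathlib import PurePosixPath


def describe(filename):
    # PurePosixPath only manipulates the path string; it never touches the disk.
    path = PurePosixPath(filename)
    # guess_type maps the file extension to a MIME type (and an encoding).
    mime_type, _encoding = mimetypes.guess_type(path.name)
    return {"stem": path.stem, "suffix": path.suffix, "mime": mime_type}


if __name__ == "__main__":
    print(describe("reports/summary.html"))
```

Note that mimetypes guesses from the extension alone; python-magic from the same list inspects file contents instead.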
Forms
Libraries for working with forms.
- Deform - Python HTML form generation library influenced by the formish form generation library.
- django-bootstrap3 - Bootstrap 3 integration with Django.
- django-bootstrap4 - Bootstrap 4 integration with Django.
- django-crispy-forms - A Django app which lets you create beautiful forms in a very elegant and DRY way.
- django-remote-forms - A platform independent Django form serializer.
- WTForms - A flexible forms validation and rendering library.
Functional Programming
Functional Programming with Python.
- Coconut - Coconut is a variant of Python built for simple, elegant, Pythonic functional programming.
- CyToolz - Cython implementation of Toolz: High performance functional utilities.
- fn.py - Functional programming in Python: implementation of missing features to enjoy FP.
- funcy - Fancy and practical functional tools.
- Toolz - A collection of functional utilities for iterators, functions, and dictionaries.
GUI Development
Libraries for working with graphical user interface applications.
- curses - Built-in wrapper for ncurses used to create terminal GUI applications.
- Eel - A library for making simple Electron-like offline HTML/JS GUI apps.
- enaml - Creating beautiful user-interfaces with Declarative Syntax like QML.
- Flexx - Flexx is a pure Python toolkit for creating GUIs that uses web technology for its rendering.
- Gooey - Turn command line programs into a full GUI application with one line.
- kivy - A library for creating NUI applications, running on Windows, Linux, Mac OS X, Android and iOS.
- pyglet - A cross-platform windowing and multimedia library for Python.
- PyGObject - Python Bindings for GLib/GObject/GIO/GTK+ (GTK+3).
- PyQt - Python bindings for the Qt cross-platform application and UI framework.
- PySimpleGUI - Wrapper for tkinter, Qt, WxPython and Remi.
- pywebview - A lightweight cross-platform native wrapper around a webview component.
- Tkinter - Tkinter is Python's de-facto standard GUI package.
- Toga - A Python native, OS native GUI toolkit.
- urwid - A library for creating terminal GUI applications with strong support for widgets, events, rich colors, etc.
- wxPython - A blending of the wxWidgets C++ class library with Python.
Game Development
Awesome game development libraries.
- Cocos2d - cocos2d is a framework for building 2D games, demos, and other graphical/interactive applications.
- Harfang3D - Python framework for 3D, VR and game development.
- Panda3D - 3D game engine developed by Disney.
- Pygame - Pygame is a set of Python modules designed for writing games.
- PyOgre - Python bindings for the Ogre 3D render engine, can be used for games, simulations, anything 3D.
- PyOpenGL - Python ctypes bindings for OpenGL and its related APIs.
- PySDL2 - A ctypes based wrapper for the SDL2 library.
- RenPy - A Visual Novel engine.
Geolocation
Libraries for geocoding addresses and working with latitudes and longitudes.
- django-countries - A Django app that provides a country field for models and forms.
- GeoDjango - A world-class geographic web framework.
- GeoIP - Python API for MaxMind GeoIP Legacy Database.
- geojson - Python bindings and utilities for GeoJSON.
- geopy - Python Geocoding Toolbox.
- pygeoip - Pure Python GeoIP API.
HTML Manipulation
Libraries for working with HTML and XML.
- BeautifulSoup - Providing Pythonic idioms for iterating, searching, and modifying HTML or XML.
- bleach - A whitelist-based HTML sanitization and text linkification library.
- cssutils - A CSS library for Python.
- html5lib - A standards-compliant library for parsing and serializing HTML documents and fragments.
- lxml - A very fast, easy-to-use and versatile library for handling HTML and XML.
- MarkupSafe - Implements a XML/HTML/XHTML Markup safe string for Python.
- pyquery - A jQuery-like library for parsing HTML.
- untangle - Converts XML documents to Python objects for easy access.
- WeasyPrint - A visual rendering engine for HTML and CSS that can export to PDF.
- xmldataset - Simple XML Parsing.
- xmltodict - Makes working with XML feel like you are working with JSON.
HTTP Clients
Libraries for working with HTTP.
- grequests - requests + gevent for asynchronous HTTP requests.
- httplib2 - Comprehensive HTTP client library.
- requests - HTTP Requests for Humans™.
- treq - Python requests like API built on top of Twisted's HTTP client.
- urllib3 - An HTTP library with thread-safe connection pooling and file post support; sanity friendly.
Hardware
Libraries for programming with hardware.
- ino - Command line toolkit for working with Arduino.
- keyboard - Hook and simulate global keyboard events on Windows and Linux.
- mouse - Hook and simulate global mouse events on Windows and Linux.
- Pingo - Pingo provides a uniform API to program devices like the Raspberry Pi, pcDuino, Intel Galileo, etc.
- PyUserInput - A module for cross-platform control of the mouse and keyboard.
- scapy - A brilliant packet manipulation library.
- wifi - A Python library and command line tool for working with WiFi on Linux.
Image Processing
Libraries for manipulating images.
- hmap - Image histogram remapping.
- imgSeek - A project for searching a collection of images using visual similarity.
- nude.py - Nudity detection.
- pagan - Retro identicon (Avatar) generation based on input string and hash.
- pillow - Pillow is the friendly PIL fork.
- pyBarcode - Create barcodes in Python without needing PIL.
- pygram - Instagram-like image filters.
- python-qrcode - A pure Python QR Code generator.
- Quads - Computer art based on quadtrees.
- scikit-image - A Python library for (scientific) image processing.
- thumbor - A smart imaging service. It enables on-demand crop, re-sizing and flipping of images.
- wand - Python bindings for MagickWand, C API for ImageMagick.
Implementations
Implementations of Python.
- CPython - Default, most widely used implementation of the Python programming language written in C.
- Cython - Optimizing Static Compiler for Python.
- CLPython - Implementation of the Python programming language written in Common Lisp.
- Grumpy - More compiler than interpreter, intended as a more powerful CPython 2.7 replacement (alpha).
- IronPython - Implementation of the Python programming language written in C#.
- Jython - Implementation of Python programming language written in Java for the JVM.
- MicroPython - A lean and efficient Python programming language implementation.
- Numba - Python JIT compiler to LLVM aimed at scientific Python.
- PeachPy - x86-64 assembler embedded in Python.
- Pyjion - A JIT for Python based upon CoreCLR.
- PyPy - A very fast and compliant implementation of the Python language.
- Pyston - A Python implementation using JIT techniques.
- Stackless Python - An enhanced version of the Python programming language.
Interactive Interpreter
Interactive Python interpreters (REPL).
- bpython - A fancy interface to the Python interpreter.
- Jupyter Notebook (IPython) - A rich toolkit to help you make the most out of using Python interactively.
- ptpython - Advanced Python REPL built on top of the python-prompt-toolkit.
Job Scheduler
Libraries for scheduling jobs.
- APScheduler - A light but powerful in-process task scheduler that lets you schedule functions.
- django-schedule - A calendaring app for Django.
- doit - A task runner and build tool.
- gunnery - Multipurpose task execution tool for distributed systems with web-based interface.
- Joblib - A set of tools to provide lightweight pipelining in Python.
- Plan - Writing crontab file in Python like a charm.
- schedule - Python job scheduling for humans.
- Spiff - A powerful workflow engine implemented in pure Python.
- TaskFlow - A Python library that helps to make task execution easy, consistent and reliable.
- Airflow - Airflow is a platform to programmatically author, schedule and monitor workflows.
Logging
Libraries for generating and working with logs.
- Eliot - Logging for complex & distributed systems.
- logbook - Logging replacement for Python.
- logging - (Python standard library) Logging facility for Python.
- raven - Python client for Sentry, a log/error tracking, crash reporting and aggregation platform for web applications.
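For the standard library logging module listed above, a minimal sketch might look like the following. The build_logger helper, the logger name "demo" and the format string are choices made for this example only; the handler writes to whatever stream is passed in.

```python
import io
import logging


def build_logger(stream, name="demo"):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Keep the example self-contained: no root-logger output, no stale handlers.
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger(buffer)
    log.debug("hidden")   # below INFO, so it is dropped
    log.info("started")   # written to the buffer
    print(buffer.getvalue(), end="")
```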
Machine Learning
A more complete awesome list is available HERE. It contains libraries, blogs, books, courses, events and meetups.
Microsoft Windows
Python programming on Microsoft Windows.
- Python(x,y) - Scientific-applications-oriented Python Distribution based on Qt and Spyder.
- pythonlibs - Unofficial Windows binaries for Python extension packages.
- PythonNet - Python Integration with the .NET Common Language Runtime (CLR).
- PyWin32 - Python Extensions for Windows.
- WinPython - Portable development environment for Windows 7/8.
Miscellaneous
Useful libraries or tools that don't fit in the categories above.
- boltons - A set of pure-Python utilities.
- itsdangerous - Various helpers to pass trusted data to untrusted environments.
Natural Language Processing
Libraries for working with human languages.
- General
- gensim - Topic Modeling for Humans.
- langid.py - Stand-alone language identification system.
- nltk - A leading platform for building Python programs to work with human language data.
- pattern - A web mining module for Python.
- polyglot - Natural language pipeline supporting hundreds of languages.
- pytext - A natural language modeling framework based on PyTorch.
- PyTorch-NLP - A toolkit enabling rapid deep learning NLP prototyping for research.
- spacy - A library for industrial-strength natural language processing in Python and Cython.
- stanfordnlp - The Stanford NLP Group's official Python library, supporting 50+ languages.
- Chinese
- jieba - The most popular Chinese text segmentation library.
- pkuseg-python - A toolkit for Chinese word segmentation in various domains.
- snownlp - A library for processing Chinese text.
- funNLP - A collection of tools and datasets for Chinese NLP.
Network Virtualization
Tools and libraries for Virtual Networking and SDN (Software Defined Networking).
- mininet - A popular network emulator and API written in Python.
- pox - A development platform for Python-based SDN control applications, such as OpenFlow SDN controllers.
Networking
Libraries for network programming.
- asyncio - (Python standard library) Asynchronous I/O, event loop, coroutines and tasks.
- pulsar - Event-driven concurrent framework for Python.
- pyzmq - A Python wrapper for the ZeroMQ message library.
- Twisted - An event-driven networking engine.
- napalm - Cross-vendor API to manipulate network devices.
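As a small sketch of the asyncio entry above, the example below runs two coroutines concurrently on the event loop. The fetch coroutine is a hypothetical placeholder that only sleeps; in real code it would await a network call.

```python
import asyncio


async def fetch(label, delay):
    # Hypothetical placeholder for a network request.
    await asyncio.sleep(delay)
    return f"{label} done"


async def main():
    # gather runs both coroutines concurrently and returns results
    # in the order the awaitables were passed, not completion order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


def run():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run())
```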
News Feed
Libraries for building feeds of users' activities.
- django-activity-stream - Generating generic activity streams from the actions on your site.
- Stream Framework - Building news feed and notification systems using Cassandra and Redis.
ORM
Libraries that implement Object-Relational Mapping or data mapping techniques.
- Relational Databases
- Django Models - A part of Django.
- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper.
- dataset - Store Python dicts in a database - works with SQLite, MySQL, and PostgreSQL.
- orator - The Orator ORM provides a simple yet beautiful ActiveRecord implementation.
- orm - An async ORM.
- peewee - A small, expressive ORM.
- pony - ORM that provides a generator-oriented interface to SQL.
- pydal - A pure Python Database Abstraction Layer.
- NoSQL Databases
- hot-redis - Rich Python data types for Redis.
- mongoengine - A Python Object-Document-Mapper for working with MongoDB.
- PynamoDB - A Pythonic interface for Amazon DynamoDB.
- redisco - A Python Library for Simple Models and Containers Persisted in Redis.
Package Management
Libraries for package and dependency management.
- pip - The Python package and dependency manager.
- conda - Cross-platform, Python-agnostic binary package manager.
Package Repositories
Local PyPI repository server and proxies.
- warehouse - Next generation Python Package Repository (PyPI).
- bandersnatch - PyPI mirroring tool provided by Python Packaging Authority (PyPA).
- devpi - PyPI server and packaging/testing/release tool.
- localshop - Local PyPI server (custom packages and auto-mirroring of pypi).
Permissions
Libraries that allow or deny users access to data or functionality.
- django-guardian - Implementation of per object permissions for Django 1.2+.
- django-rules - A tiny but powerful app providing object-level permissions to Django, without requiring a database.
Processes
Libraries for starting and communicating with OS processes.
- delegator.py - Subprocesses for Humans™ 2.0.
- sarge - Yet another wrapper for subprocess.
- sh - A full-fledged subprocess replacement for Python.
Queue
Libraries for working with event and task queues.
- celery - An asynchronous task queue/job queue based on distributed message passing.
- huey - Little multi-threaded task queue.
- mrq - Mr. Queue - A distributed worker task queue in Python using Redis & gevent.
- rq - Simple job queues for Python.
Recommender Systems
Libraries for building recommender systems.
- annoy - Approximate Nearest Neighbors in C++/Python optimized for memory usage.
- fastFM - A library for Factorization Machines.
- implicit - A fast Python implementation of collaborative filtering for implicit datasets.
- libffm - A library for Field-aware Factorization Machine (FFM).
- lightfm - A Python implementation of a number of popular recommendation algorithms.
- spotlight - Deep recommender models using PyTorch.
- Surprise - A scikit for building and analyzing recommender systems.
- tensorrec - A Recommendation Engine Framework in TensorFlow.
RESTful API
Libraries for developing RESTful APIs.
- Django
- django-rest-framework - A powerful and flexible toolkit to build web APIs.
- django-tastypie - Creating delicious APIs for Django apps.
- Flask
- eve - REST API framework powered by Flask, MongoDB and good intentions.
- flask-api-utils - Taking care of API representation and authentication for Flask.
- flask-api - Browsable Web APIs for Flask.
- flask-restful - Quickly building REST APIs for Flask.
- flask-restless - Generating RESTful APIs for database models defined with SQLAlchemy.
- Pyramid
- cornice - A RESTful framework for Pyramid.
- Framework agnostic
- apistar - A smart Web API framework, designed for Python 3.
- falcon - A high-performance framework for building cloud APIs and web app backends.
- hug - A Python 3 framework for cleanly exposing APIs.
- restless - Framework agnostic REST framework based on lessons learned from Tastypie.
- ripozo - Quickly creating REST/HATEOAS/Hypermedia APIs.
- sandman - Automated REST APIs for existing database-driven systems.
Robotics
Libraries for robotics.
- PythonRobotics - This is a compilation of various robotics algorithms with visualizations.
- rospy - This is a library for ROS (Robot Operating System).
RPC Servers
RPC-compatible servers.
- SimpleJSONRPCServer - This library is an implementation of the JSON-RPC specification.
- SimpleXMLRPCServer - (Python standard library) Simple XML-RPC server implementation, single-threaded.
- zeroRPC - zerorpc is a flexible RPC implementation based on ZeroMQ and MessagePack.
Science
Libraries for scientific computing. Also see Python-for-Scientists.
- astropy - A community Python library for Astronomy.
- bcbio-nextgen - Providing best-practice pipelines for fully automated high throughput sequencing analysis.
- bccb - Collection of useful code related to biological analysis.
- Biopython - Biopython is a set of freely available tools for biological computation.
- cclib - A library for parsing and interpreting the results of computational chemistry packages.
- Colour - Implementing a comprehensive number of colour theory transformations and algorithms.
- NetworkX - A high-productivity software for complex networks.
- NIPY - A collection of neuroimaging toolkits.
- NumPy - A fundamental package for scientific computing with Python.
- Open Babel - A chemical toolbox designed to speak the many languages of chemical data.
- ObsPy - A Python toolbox for seismology.
- PyDy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion.
- PyMC - Markov Chain Monte Carlo sampling toolkit.
- QuTiP - Quantum Toolbox in Python.
- RDKit - Cheminformatics and Machine Learning Software.
- SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.
- statsmodels - Statistical modeling and econometrics in Python.
- SymPy - A Python library for symbolic mathematics.
- Zipline - A Pythonic algorithmic trading library.
- SimPy - A process-based discrete-event simulation framework.
Search
Libraries and software for indexing and performing search queries on data.
- elasticsearch-py - The official low-level Python client for Elasticsearch.
- elasticsearch-dsl-py - The official high-level Python client for Elasticsearch.
- django-haystack - Modular search for Django.
- pysolr - A lightweight Python wrapper for Apache Solr.
- whoosh - A fast, pure Python search engine library.
Serialization
Libraries for serializing complex data types.
- marshmallow - A lightweight library for converting complex objects to and from simple Python datatypes.
- pysimdjson - Python bindings for simdjson.
- python-rapidjson - A Python wrapper around RapidJSON.
- ultrajson - A fast JSON decoder and encoder written in C with Python bindings.
Serverless Frameworks
Frameworks for developing serverless Python code.
- python-lambda - A toolkit for developing and deploying Python code in AWS Lambda.
- Zappa - A tool for deploying WSGI applications on AWS Lambda and API Gateway.
Libraries for Managing Cloud Services
Parsing
Libraries for parsing and manipulating specific text formats.
- General
- tablib - A module for Tabular Datasets in XLS, CSV, JSON, YAML.
- Office
- openpyxl - A library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.
- pyexcel - Providing one API for reading, manipulating and writing csv, ods, xls, xlsx and xlsm files.
- python-docx - Reads, queries and modifies Microsoft Word 2007/2008 docx files.
- python-pptx - Python library for creating and updating PowerPoint (.pptx) files.
- unoconv - Convert between any document format supported by LibreOffice/OpenOffice.
- XlsxWriter - A Python module for creating Excel .xlsx files.
- xlwings - A BSD-licensed library that makes it easy to call Python from Excel and vice versa.
- xlwt / xlrd - Writing and reading data and formatting information from Excel files.
- Markdown
- Mistune - A fast, full-featured pure Python Markdown parser.
- Python-Markdown - A Python implementation of John Gruber’s Markdown.
- YAML
- PyYAML - YAML implementations for Python.
- CSV
- csvkit - Utilities for converting to and working with CSV.
- Archive
- unp - A command line tool that can unpack archives easily.
Static Site Generator
A static site generator is software that takes text and templates as input and produces HTML files as output.
- mkdocs - Markdown friendly documentation generator.
- pelican - Static site generator that supports Markdown and reST syntax.
- lektor - An easy to use static CMS and blog engine.
- nikola - A static website and blog generator.
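The "text + templates in, HTML out" idea can be sketched with the standard library alone. This is only a toy, not how any of the generators above work internally: `PAGE_TEMPLATE`, `render_page`, and `build_site` are hypothetical names, and real tools add Markdown parsing, HTML escaping, and file output.

```python
from string import Template

# Hypothetical page template; real generators use richer engines such as Jinja2.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def render_page(title: str, body: str) -> str:
    """Fill the template with one page's text and return the HTML."""
    return PAGE_TEMPLATE.substitute(title=title, body=body)


def build_site(pages: dict[str, dict[str, str]]) -> dict[str, str]:
    """Map each output file name to its rendered HTML."""
    return {name: render_page(**fields) for name, fields in pages.items()}


site = build_site({"index.html": {"title": "Home", "body": "Welcome."}})
print(site["index.html"])
```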
Tagging
Libraries for tagging items.
- django-taggit - Simple tagging for Django.
Template Engine
Libraries and tools for templating and lexing.
- Jinja2 - A modern and designer friendly templating language.
- Genshi - Python templating toolkit for generation of web-aware output.
- Mako - Hyperfast and lightweight templating for the Python platform.
Testing
Libraries for testing codebases and generating test data.
- Testing Frameworks
- pytest - A mature full-featured Python testing tool.
- hypothesis - Hypothesis is an advanced Quickcheck style property based testing library.
- nose2 - The successor to nose, based on unittest2.
- Robot Framework - A generic test automation framework.
- unittest - (Python standard library) Unit testing framework.
- Test Runners
- GUI / Web Testing
- locust - Scalable user load testing tool written in Python.
- PyAutoGUI - PyAutoGUI is a cross-platform GUI automation Python module for human beings.
- Selenium - Python bindings for Selenium WebDriver.
- sixpack - A language-agnostic A/B Testing framework.
- splinter - Open source tool for testing web applications.
- Mock
- mock - (Python standard library) A mocking and patching library.
- doublex - Powerful test doubles framework for Python.
- freezegun - Travel through time by mocking the datetime module.
- httmock - A mocking library for requests for Python 2.6+ and 3.2+.
- httpretty - HTTP request mock tool for Python.
- mocket - A socket mock framework with gevent/asyncio/SSL support.
- responses - A utility library for mocking out the requests Python library.
- VCR.py - Record and replay HTTP interactions on your tests.
- Object Factories
- factory_boy - A test fixtures replacement for Python.
- mixer - Another fixtures replacement. Supports Django, Flask, SQLAlchemy, Peewee, and more.
- model_mommy - Creating random fixtures for testing in Django.
- Code Coverage
- coverage - Code coverage measurement.
- Fake Data
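For orientation, here is a minimal sketch that combines the two standard-library entries from the list above, `unittest` and `unittest.mock`. The function under test, `fetch_greeting`, and its `client` argument are hypothetical and exist only for this example.

```python
import unittest
from unittest import mock


def fetch_greeting(client) -> str:
    """Hypothetical function under test: asks a client object for a name."""
    return f"Hello, {client.get_name()}!"


class FetchGreetingTest(unittest.TestCase):
    def test_uses_name_from_client(self):
        # Replace the real client with a mock that returns a canned value.
        client = mock.Mock()
        client.get_name.return_value = "Ada"

        self.assertEqual(fetch_greeting(client), "Hello, Ada!")
        client.get_name.assert_called_once_with()


# Run the test case programmatically instead of via unittest.main(),
# so the result object can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FetchGreetingTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real project the same test class would usually live in its own file and be collected by a runner such as pytest or `python -m unittest`.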
General-Purpose Machine Learning
- Little Ball of Fur -> A graph sampling extension library for NetworkX with a Scikit-Learn like API.
- Karate Club -> An unsupervised machine learning extension library for NetworkX with a Scikit-Learn like API.
- Auto_ViML -> Automatically Build Variant Interpretable ML models fast! Auto_ViML, pronounced "auto vimal", is a comprehensive and scalable Python AutoML toolkit with imbalanced handling, ensembling, stacking and built-in feature selection. Featured in a Medium article.
- PyOD -> Python Outlier Detection, comprehensive and scalable Python toolkit for detecting outlying objects in multivariate data. Featured for Advanced models, including Neural Networks/Deep Learning and Outlier Ensembles.
- steppy -> Lightweight Python library for fast and reproducible machine learning experimentation. Introduces a very simple interface that enables clean machine learning pipeline design.
- steppy-toolkit -> Curated collection of the neural networks, transformers and models that make your machine learning work faster and more effective.
- CNTK - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit. Documentation can be found here.
- auto_ml - Automated machine learning for production and analytics. Lets you focus on the fun parts of ML, while outputting production-ready code, and detailed analytics of your dataset and results. Includes support for NLP, XGBoost, CatBoost, LightGBM, and soon, deep learning.
- machine learning - Automated build consisting of a web interface and a set of programmatic-interface APIs for support vector machines. Corresponding datasets are stored in a SQL database, and the generated models used for predictions are stored in a NoSQL datastore.
- XGBoost - Python bindings for eXtreme Gradient Boosting (Tree) Library.
- Apache SINGA - An Apache Incubating project for developing an open source machine learning library.
- Bayesian Methods for Hackers - Book/iPython notebooks on Probabilistic Programming in Python.
- Featureforge - A set of tools for creating and testing machine learning features, with a scikit-learn compatible API.
- MLlib in Apache Spark - Distributed machine learning library in Spark
- Hydrosphere Mist - A service for deploying Apache Spark MLlib machine learning models as real-time, batch, or reactive web services.
- scikit-learn - A Python module for machine learning built on top of SciPy.
- metric-learn - A Python module for metric learning.
- SimpleAI - Python implementation of many of the artificial intelligence algorithms described in the book "Artificial Intelligence, a Modern Approach". It focuses on providing an easy to use, well documented and tested library.
- astroML - Machine Learning and Data Mining for Astronomy.
- graphlab-create - A library with various machine learning models (regression, clustering, recommender systems, graph analytics, etc.) implemented on top of a disk-backed DataFrame.
- BigML - A library that contacts external servers.
- pattern - Web mining module for Python.
- NuPIC - Numenta Platform for Intelligent Computing.
- Pylearn2 - A Machine Learning library based on Theano. [Deprecated]
- keras - High-level neural networks frontend for TensorFlow, CNTK and Theano.
- Lasagne - Lightweight library to build and train neural networks in Theano.
- hebel - GPU-Accelerated Deep Learning Library in Python. [Deprecated]
- Chainer - Flexible neural network framework.
- prophet - Fast and automated time series forecasting framework by Facebook.
- gensim - Topic Modelling for Humans.
- topik - Topic modelling toolkit. [Deprecated]
- PyBrain - Another Python Machine Learning Library.
- Brainstorm - Fast, flexible and fun neural networks. This is the successor of PyBrain.
- Surprise - A scikit for building and analyzing recommender systems.
- implicit - Fast Python Collaborative Filtering for Implicit Datasets.
- LightFM - A Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback.
- Crab - A flexible, fast recommender engine. [Deprecated]
- python-recsys - A Python library for implementing a Recommender System.
- thinking bayes - Book on Bayesian Analysis.
- Image-to-Image Translation with Conditional Adversarial Networks - Implementation of image to image (pix2pix) translation from the paper by Isola et al. [DEEP LEARNING]
- Restricted Boltzmann Machines - Restricted Boltzmann Machines in Python. [DEEP LEARNING]
- Bolt - Bolt Online Learning Toolbox. [Deprecated]
- CoverTree - Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree [Deprecated]
- nilearn - Machine learning for NeuroImaging in Python.
- neuropredict - Aimed at novice machine learners and non-expert programmers, this package offers easy (no coding needed) and comprehensive machine learning (evaluation and full report of predictive performance WITHOUT requiring you to code) in Python for NeuroImaging and any other type of features. It aims to absorb much of the ML workflow, unlike other packages such as nilearn and pymvpa, which require you to learn their API and write code to produce anything useful.
- imbalanced-learn - Python module to perform under sampling and over sampling with various techniques.
- Shogun - The Shogun Machine Learning Toolbox.
- Pyevolve - Genetic algorithm framework. [Deprecated]
- Caffe - A deep learning framework developed with cleanliness, readability, and speed in mind.
- breze - Theano based library for deep and recurrent neural networks.
- Cortex - Open source platform for deploying machine learning models in production.
- pyhsmm - library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations.
- mrjob - A library for running Python programs on Hadoop.
- SKLL - A wrapper around scikit-learn that makes it simpler to conduct experiments.
- neurolab
- Spearmint - Spearmint is a package to perform Bayesian optimization according to the algorithms outlined in the paper: Practical Bayesian Optimization of Machine Learning Algorithms. Jasper Snoek, Hugo Larochelle and Ryan P. Adams. Advances in Neural Information Processing Systems, 2012. [Deprecated]
- Pebl - Python Environment for Bayesian Learning. [Deprecated]
- Theano - Optimizing GPU-meta-programming code generating array oriented optimizing math compiler in Python.
- TensorFlow - Open source software library for numerical computation using data flow graphs.
- pomegranate - Hidden Markov Models for Python, implemented in Cython for speed and efficiency.
- python-timbl - A Python extension module wrapping the full TiMBL C++ programming interface. Timbl is an elaborate k-Nearest Neighbours machine learning toolkit.
- deap - Evolutionary algorithm framework.
- pydeep - Deep Learning In Python. [Deprecated]
- mlxtend - A library consisting of useful tools for data science and machine learning tasks.
- neon - Nervana's high-performance Python-based Deep Learning framework [DEEP LEARNING]. [Deprecated]
- Optunity - A library dedicated to automated hyperparameter optimization with a simple, lightweight API to facilitate drop-in replacement of grid search.
- Neural Networks and Deep Learning - Code samples for my book "Neural Networks and Deep Learning" [DEEP LEARNING].
- Annoy - Approximate nearest neighbours implementation.
- TPOT - Tool that automatically creates and optimizes machine learning pipelines using genetic programming. Consider it your personal data science assistant, automating a tedious part of machine learning.
- pgmpy - A Python library for working with Probabilistic Graphical Models.
- DIGITS - The Deep Learning GPU Training System (DIGITS) is a web application for training deep learning models.
- Orange - Open source data visualization and data analysis for novices and experts.
- MXNet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Go, Javascript and more.
- milk - Machine learning toolkit focused on supervised classification. [Deprecated]
- TFLearn - Deep learning library featuring a higher-level API for TensorFlow.
- REP - an IPython-based environment for conducting data-driven research in a consistent and reproducible way. REP is not trying to substitute scikit-learn, but extends it and provides better user experience. [Deprecated]
- rgf_python - Python bindings for Regularized Greedy Forest (Tree) Library.
- skbayes - Python package for Bayesian Machine Learning with scikit-learn API.
- fuku-ml - Simple machine learning library, including Perceptron, Regression, Support Vector Machine, Decision Tree and more. It is easy to use and easy to learn for beginners.
- Xcessiv - A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling.
- PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
- ML-From-Scratch - Implementations of Machine Learning models from scratch in Python with a focus on transparency. Aims to showcase the nuts and bolts of ML in an accessible way.
- Edward - A library for probabilistic modeling, inference, and criticism. Built on top of TensorFlow.
- xRBM - A library for Restricted Boltzmann Machine (RBM) and its conditional variants in Tensorflow.
- CatBoost - General purpose gradient boosting on decision trees library with categorical features support out of the box. It is easy to install, well documented and supports CPU and GPU (even multi-GPU) computation.
- stacked_generalization - Implementation of the machine learning stacking technique as a handy library in Python.
- modAL - A modular active learning framework for Python, built on top of scikit-learn.
- Cogitare: A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python.
- Parris - Parris, the automated infrastructure setup tool for machine learning algorithms.
- neonrvm - neonrvm is an open source machine learning library based on RVM technique. It's written in C programming language and comes with Python programming language bindings.
- Turi Create - Machine learning from Apple. Turi Create simplifies the development of custom machine learning models. You don't have to be a machine learning expert to add recommendations, object detection, image classification, image similarity or activity classification to your app.
- xLearn - A high performance, easy-to-use, and scalable machine learning package, which can be used to solve large-scale machine learning problems. xLearn is especially useful for solving machine learning problems on large-scale sparse data, which is very common in Internet services such as online advertisement and recommender systems.
- mlens - A high performance, memory efficient, maximally parallelized ensemble learning, integrated with scikit-learn.
- Netron - Visualizer for machine learning models.
- Thampi - Machine Learning Prediction System on AWS Lambda
- MindsDB - Open Source framework to streamline use of neural networks.
- Microsoft Recommenders: Examples and best practices for building recommendation systems, provided as Jupyter notebooks. The repo contains some of the latest state of the art algorithms from Microsoft Research as well as from other companies and institutions.
- StellarGraph: Machine Learning on Graphs, a Python library for machine learning on graph-structured (network-structured) data.
- BentoML: Toolkit for packaging and deploying machine learning models for serving in production.
- MiraiML: An asynchronous engine for continuous & autonomous machine learning, built for real-time usage.
- numpy-ML: Reference implementations of ML models written in numpy
- creme: A framework for online machine learning.
- Neuraxle: A framework providing the right abstractions to ease research, development, and deployment of your ML pipelines.
- Cornac - A comparative framework for multimodal recommender systems with a focus on models leveraging auxiliary data.
- JAX - JAX is Autograd and XLA, brought together for high-performance machine learning research.
- Catalyst - High-level utils for PyTorch DL & RL research. It was developed with a focus on reproducibility, fast experimentation and reuse of code and ideas, so that you can research or develop something new rather than write another regular train loop.
- Fastai - High-level wrapper built on the top of Pytorch which supports vision, text, tabular data and collaborative filtering.
- scikit-multiflow - A machine learning framework for multi-output/multi-label and stream data.
- Lightwood - A PyTorch-based framework that breaks down machine learning problems into smaller blocks that can be glued together seamlessly, with the objective of building predictive models with one line of code.
- bayeso - A simple, but essential Bayesian optimization package, written in Python.
- mljar-supervised - An Automated Machine Learning (AutoML) python package for tabular data. It can handle: Binary Classification, MultiClass Classification and Regression. It provides explanations and markdown reports.
Data Analysis / Data Visualization
- SciPy - A Python-based ecosystem of open-source software for mathematics, science, and engineering.
- NumPy - A fundamental package for scientific computing with Python.
- AutoViz - AutoViz performs automatic visualization of any dataset with a single line of Python code. Give it any input file (CSV, txt or json) of any size and AutoViz will visualize it. See Medium article.
- Numba - Python JIT (just in time) compiler to LLVM aimed at scientific Python by the developers of Cython and NumPy.
- Mars - A tensor-based framework for large-scale data computation, which is often regarded as a parallel and distributed version of NumPy.
- NetworkX - A high-productivity software for complex networks.
- igraph - Python binding to the igraph general-purpose graph library.
- Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
- Open Mining - Business Intelligence (BI) in Python (Pandas web interface) [Deprecated]
- PyMC - Markov Chain Monte Carlo sampling toolkit.
- zipline - A Pythonic algorithmic trading library.
- PyDy - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion based around NumPy, SciPy, IPython, and matplotlib.
- SymPy - A Python library for symbolic mathematics.
- statsmodels - Statistical modeling and econometrics in Python.
- astropy - A community Python library for Astronomy.
- matplotlib - A Python 2D plotting library.
- bokeh - Interactive Web Plotting for Python.
- plotly - Collaborative web plotting for Python and matplotlib.
- altair - A Python to Vega translator.
- d3py - A plotting library for Python, based on D3.js.
- PyDexter - Simple plotting for Python. Wrapper for D3xterjs; easily render charts in-browser.
- ggplot - Same API as ggplot2 for R. [Deprecated]
- ggfortify - Unified interface to ggplot2 popular R packages.
- Kartograph.py - Rendering beautiful SVG maps in Python.
- pygal - A Python SVG Charts Creator.
- PyQtGraph - A pure-python graphics and GUI library built on PyQt4 / PySide and NumPy.
- pycascading [Deprecated]
- Petrel - Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python.
- Blaze - NumPy and Pandas interface to Big Data.
- emcee - The Python ensemble sampling toolkit for affine-invariant MCMC.
- windML - A Python Framework for Wind Energy Analysis and Prediction.
- vispy - GPU-based high-performance interactive OpenGL 2D/3D data visualization library.
- cerebro2 - A web-based visualization and debugging platform for NuPIC. [Deprecated]
- NuPIC Studio - An all-in-one NuPIC Hierarchical Temporal Memory visualization and debugging super-tool! [Deprecated]
- SparklingPandas - Pandas on PySpark (POPS).
- Seaborn - A python visualization library based on matplotlib.
- bqplot - An API for plotting in Jupyter (IPython).
- pastalog - Simple, realtime visualization of neural network training performance.
- Superset - A data exploration platform designed to be visual, intuitive, and interactive.
- Dora - Tools for exploratory data analysis in Python.
- Ruffus - Computation Pipeline library for python.
- SOMPY - Self Organizing Map written in Python (Uses neural networks for data analysis).
- somoclu - Massively parallel self-organizing maps: accelerates training on multicore CPUs, GPUs, and clusters, and has a Python API.
- HDBScan - Implementation of the HDBSCAN algorithm in Python, used for clustering.
- visualize_ML - A python package for data exploration and data analysis. [Deprecated]
- scikit-plot - A visualization library for quick and easy generation of common plots in data analysis and machine learning.
- Bowtie - A dashboard library for interactive visualizations using flask socketio and react.
- lime - Lime is about explaining what machine learning classifiers (or models) are doing. It is able to explain any black box classifier, with two or more classes.
- PyCM - PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix input. It is a tool for post-classification model evaluation that supports most class and overall statistics parameters.
- Dash - A framework for creating analytical web applications built on top of Plotly.js, React, and Flask
- Lambdo - A workflow engine for solving machine learning problems by combining in one analysis pipeline (i) feature engineering and machine learning (ii) model training and prediction (iii) table population and column evaluation via user-defined (Python) functions.
- TensorWatch - Debugging and visualization tool for machine learning and data science. It extensively leverages Jupyter Notebook to show real-time visualizations of data in running processes such as machine learning training.
- dowel - A little logger for machine learning research. Output any object to the terminal, CSV, TensorBoard, text logs on disk, and more with just one call to logger.log().
Misc Scripts / iPython Notebooks / Codebases
- Map/Reduce implementations of common ML algorithms: Jupyter notebooks that cover how to implement from scratch different ML algorithms (ordinary least squares, gradient descent, k-means, alternating least squares), using Python NumPy, and how to then make these implementations scalable using Map/Reduce and Spark.
- BioPy - Biologically-Inspired and Machine Learning Algorithms in Python. [Deprecated]
- SVM Explorer - Interactive SVM Explorer, using Dash and scikit-learn
- pattern_classification
- thinking stats 2
- hyperopt
- numpic
- 2012-paper-diginorm
- A gallery of interesting IPython notebooks
- ipython-notebooks
- data-science-ipython-notebooks - Continually updated Data Science Python Notebooks: Spark, Hadoop MapReduce, HDFS, AWS, Kaggle, scikit-learn, matplotlib, pandas, NumPy, SciPy, and various command lines.
- decision-weights
- Sarah Palin LDA - Topic Modeling the Sarah Palin emails.
- Diffusion Segmentation - A collection of image segmentation algorithms based on diffusion methods.
- Scipy Tutorials - SciPy tutorials. This is outdated; check out scipy-lecture-notes.
- Crab - A recommendation engine library for Python.
- BayesPy - Bayesian Inference Tools in Python.
- scikit-learn tutorials - Series of notebooks for learning scikit-learn.
- sentiment-analyzer - Tweets Sentiment Analyzer
- sentiment_classifier - Sentiment classifier using word sense disambiguation.
- group-lasso - Some experiments with the coordinate descent algorithm used in the (Sparse) Group Lasso model.
- jProcessing - Kanji / Hiragana / Katakana to Romaji Converter. Edict Dictionary & parallel sentences Search. Sentence Similarity between two JP Sentences. Sentiment Analysis of Japanese Text. Run CaboCha (ISO-8859-1 configured) in Python.
- mne-python-notebooks - IPython notebooks for EEG/MEG data processing using mne-python.
- Neon Course - IPython notebooks for a complete course around understanding Nervana's Neon.
- pandas cookbook - Recipes for using Python's pandas library.
- climin - Optimization library focused on machine learning, pythonic implementations of gradient descent, LBFGS, rmsprop, adadelta and others.
- Allen Downey’s Data Science Course - Code for Data Science at Olin College, Spring 2014.
- Allen Downey’s Think Bayes Code - Code repository for Think Bayes.
- Allen Downey’s Think Complexity Code - Code for Allen Downey's book Think Complexity.
- Allen Downey’s Think OS Code - Text and supporting code for Think OS: A Brief Introduction to Operating Systems.
- Python Programming for the Humanities - Course for Python programming for the Humanities, assuming no prior knowledge. Heavy focus on text processing / NLP.
- GreatCircle - Library for calculating great circle distance.
- Optunity examples - Examples demonstrating how to use Optunity in synergy with machine learning libraries.
- Dive into Machine Learning with Python Jupyter notebook and scikit-learn - "I learned Python by hacking first, and getting serious later. I wanted to do this with Machine Learning. If this is your style, join me in getting a bit ahead of yourself."
- TDB - TensorDebugger (TDB) is a visual debugger for deep learning. It features interactive, node-by-node debugging and visualization for TensorFlow.
- Suiron - Machine Learning for RC Cars.
- Introduction to machine learning with scikit-learn - IPython notebooks from Data School's video tutorials on scikit-learn.
- Practical XGBoost in Python - comprehensive online course about using XGBoost in Python.
- Introduction to Machine Learning with Python - Notebooks and code for the book "Introduction to Machine Learning with Python"
- Pydata book - Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
- Homemade Machine Learning - Python examples of popular machine learning algorithms with interactive Jupyter demos and math being explained
- Prodmodel - Build tool for data science pipelines.
- the-elements-of-statistical-learning - This repository contains Jupyter notebooks implementing the algorithms found in the book and summary of the textbook.
Neural Networks
- nn_builder - nn_builder is a python package that lets you build neural networks in 1 line
- NeuralTalk - NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences. [Deprecated]
- Neuron - Neuron is a simple class for time series predictions. It uses LNU (Linear Neural Unit), QNU (Quadratic Neural Unit), RBF (Radial Basis Function), MLP (Multi Layer Perceptron), and MLP-ELM (Multi Layer Perceptron - Extreme Learning Machine) neural networks trained with gradient descent or the Levenberg–Marquardt algorithm. [Deprecated]
- Data Driven Code - Very simple implementation of neural networks for dummies in python without using any libraries, with detailed comments.
- Machine Learning, Data Science and Deep Learning with Python - LiveVideo course that covers machine learning, Tensorflow, artificial intelligence, and neural networks.
- TResNet: High Performance GPU-Dedicated Architecture - TResNet models were designed and optimized to give the best speed-accuracy tradeoff out there on GPUs.
Kaggle Competition Source Code
- open-solution-home-credit -> source code and experiment results for Home Credit Default Risk.
- open-solution-googleai-object-detection -> source code and experiment results for Google AI Open Images - Object Detection Track.
- open-solution-salt-identification -> source code and experiment results for TGS Salt Identification Challenge.
- open-solution-ship-detection -> source code and experiment results for Airbus Ship Detection Challenge.
- open-solution-data-science-bowl-2018 -> source code and experiment results for 2018 Data Science Bowl.
- open-solution-value-prediction -> source code and experiment results for Santander Value Prediction Challenge.
- open-solution-toxic-comments -> source code for Toxic Comment Classification Challenge.
- wiki challenge - An implementation of Dell Zhang's solution to Wikipedia's Participation Challenge on Kaggle.
- kaggle insults - Kaggle Submission for "Detecting Insults in Social Commentary".
- kaggle_acquire-valued-shoppers-challenge - Code for the Kaggle acquire valued shoppers challenge.
- kaggle-cifar - Code for the CIFAR-10 competition at Kaggle, uses cuda-convnet.
- kaggle-blackbox - Deep learning made easy.
- kaggle-accelerometer - Code for Accelerometer Biometric Competition at Kaggle.
- kaggle-advertised-salaries - Predicting job salaries from ads - a Kaggle competition.
- kaggle amazon - Amazon access control challenge.
- kaggle-bestbuy_big - Code for the Best Buy competition at Kaggle.
- kaggle-bestbuy_small
- Kaggle Dogs vs. Cats - Code for Kaggle Dogs vs. Cats competition.
- Kaggle Galaxy Challenge - Winning solution for the Galaxy Challenge on Kaggle.
- Kaggle Gender - A Kaggle competition: discriminate gender based on handwriting.
- Kaggle Merck - Merck challenge at Kaggle.
- Kaggle Stackoverflow - Predicting closed questions on Stack Overflow.
- wine-quality - Predicting wine quality.
Reinforcement Learning
- DeepMind Lab - DeepMind Lab is a 3D learning environment based on id Software's Quake III Arena via ioquake3 and other open source software. Its primary purpose is to act as a testbed for research in artificial intelligence, especially deep reinforcement learning.
- Gym - OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.
- Serpent.AI - Serpent.AI is a game agent framework that allows you to turn any video game you own into a sandbox to develop AI and machine learning experiments. For both researchers and hobbyists.
- ViZDoom - ViZDoom allows developing AI bots that play Doom using only the visual information (the screen buffer). It is primarily intended for research in machine visual learning, and deep reinforcement learning, in particular.
- Roboschool - Open-source software for robot simulation, integrated with OpenAI Gym.
- Retro - Retro Games in Gym
- SLM Lab - Modular Deep Reinforcement Learning framework in PyTorch.
- Coach - Reinforcement Learning Coach by Intel® AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms
- garage - A toolkit for reproducible reinforcement learning research
- metaworld - An open source robotics benchmark for meta- and multi-task reinforcement learning
Text Processing
Libraries for parsing and manipulating plain text.
- General
- chardet - Python 2/3 compatible character encoding detector.
- difflib - (Python standard library) Helpers for computing deltas.
- ftfy - Makes Unicode text less broken and more consistent automagically.
- fuzzywuzzy - Fuzzy String Matching.
- Levenshtein - Fast computation of Levenshtein distance and string similarity.
- pangu.py - Paranoid text spacing.
- pyfiglet - An implementation of figlet written in Python.
- pypinyin - Convert Chinese hanzi (漢字) to pinyin (拼音).
- textdistance - Compute distance between sequences with 30+ algorithms.
- unidecode - ASCII transliterations of Unicode text.
- Slugify
- awesome-slugify - A Python slugify library that can preserve unicode.
- python-slugify - A Python slugify library that translates unicode to ASCII.
- unicode-slugify - A slugifier that generates unicode slugs with Django as a dependency.
- Unique identifiers
- Parser
- ply - Implementation of lex and yacc parsing tools for Python.
- pygments - A generic syntax highlighter.
- pyparsing - A general purpose framework for generating parsers.
- python-nameparser - Parsing human names into their individual components.
- python-phonenumbers - Parsing, formatting, storing and validating international phone numbers.
- python-user-agents - Browser user agent parser.
- sqlparse - A non-validating SQL parser.
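As a quick illustration of the kind of task these libraries address, here is a minimal sketch that uses only `difflib` from the standard library (listed above) to score string similarity and suggest close matches. The helper names `similarity` and `closest` are hypothetical and chosen for this example; third-party packages such as fuzzywuzzy or Levenshtein expose their own, different APIs.

```python
import difflib


def similarity(a, b):
    """Return a ratio in [0, 1] describing how similar two strings are."""
    # SequenceMatcher(None, a, b): None means no characters are treated as junk.
    return difflib.SequenceMatcher(None, a, b).ratio()


def closest(word, candidates):
    """Return up to three candidates that resemble `word`, best match first."""
    return difflib.get_close_matches(word, candidates, n=3, cutoff=0.6)


if __name__ == "__main__":
    print(similarity("python", "python"))
    print(closest("appel", ["ape", "apple", "peach", "puppy"]))
```

The `cutoff` value of 0.6 is the library default; raising it makes the matching stricter.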
Third-party APIs
Libraries for accessing third-party service APIs. Also see List of Python API Wrappers and Libraries.
- apache-libcloud - One Python library for all clouds.
- boto3 - Python interface to Amazon Web Services.
- django-wordpress - WordPress models and views for Django.
- facebook-sdk - Facebook Platform Python SDK.
- google-api-python-client - Google APIs Client Library for Python.
- gspread - Google Spreadsheets Python API.
- twython - A Python wrapper for the Twitter API.
URL Manipulation
Libraries for parsing URLs.
- furl - A small Python library that makes parsing and manipulating URLs easy.
- purl - A simple, immutable URL class with a clean API for interrogation and manipulation.
- pyshorteners - A pure Python URL shortening lib.
- webargs - A friendly library for parsing HTTP request arguments with built-in support for popular web frameworks.
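For comparison, the standard library's `urllib.parse` already covers basic parsing and rebuilding of URLs. The sketch below is not how furl or purl work internally; it only shows the underlying task, and the helper name `set_query_param` is a hypothetical one introduced for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def set_query_param(url, key, value):
    """Return `url` with the query parameter `key` set to `value`."""
    parts = urlsplit(url)
    # parse_qs maps each parameter name to a list of its values.
    query = parse_qs(parts.query, keep_blank_values=True)
    query[key] = [value]
    # doseq=True expands the value lists back into repeated key=value pairs.
    new_query = urlencode(query, doseq=True)
    return urlunsplit(parts._replace(query=new_query))


if __name__ == "__main__":
    print(set_query_param("https://example.com/search?q=python&page=1", "page", "2"))
```

Libraries such as furl wrap this kind of manipulation in a more convenient object-oriented interface.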
Video
Libraries for manipulating video and GIFs.
- moviepy - A module for script-based movie editing with many formats, including animated GIFs.
- scikit-video - Video processing routines for SciPy.
WSGI Servers
WSGI-compatible web servers.
- bjoern - Asynchronous, very fast and written in C.
- gunicorn - Pre-forked, partly written in C.
- uWSGI - A project that aims to develop a full stack for building hosting services, written in C.
- waitress - Multi-threaded, powers Pyramid.
- werkzeug - A WSGI utility library for Python that powers Flask and can easily be embedded into your own projects.
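All of these servers run the same kind of object: a WSGI application, which is a callable that takes the request environment and a `start_response` function and returns an iterable of bytes (PEP 3333). A minimal sketch follows; the name `application` and the response text are assumptions made for this example.

```python
def application(environ, start_response):
    """A minimal WSGI application that always answers with plain text."""
    body = b"Hello, WSGI!\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The server supplies start_response; call it once before returning the body.
    start_response("200 OK", headers)
    # The return value must be an iterable of byte strings.
    return [body]
```

For local experiments, the standard library's `wsgiref.simple_server.make_server` can serve this callable; in production one of the servers listed above would typically be pointed at it instead.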
Web Asset Management
Tools for managing, compressing and minifying website assets.
- django-compressor - Compresses linked and inline JavaScript or CSS into a single cached file.
- django-pipeline - An asset packaging library for Django.
- django-storages - A collection of custom storage back ends for Django.
- fanstatic - Packages, optimizes, and serves static file dependencies as Python packages.
- fileconveyor - A daemon to detect and sync files to CDNs, S3 and FTP.
- flask-assets - Helps you integrate webassets into your Flask app.
- webassets - Bundles, optimizes, and manages unique cache-busting URLs for static resources.
Web Content Extracting
Libraries for extracting web content.
- html2text - Convert HTML to Markdown-formatted text.
- lassie - Web Content Retrieval for Humans.
- micawber - A small library for extracting rich content from URLs.
- newspaper - News extraction, article extraction and content curation in Python.
- python-readability - Fast Python port of arc90's readability tool.
- requests-html - Pythonic HTML Parsing for Humans.
- sumy - A module for automatic summarization of text documents and HTML pages.
- textract - Extract text from any document, Word, PowerPoint, PDFs, etc.
- toapi - Every web site provides APIs.
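To give a sense of what content extraction involves, here is a rough sketch built on the standard library's `html.parser`. It collects the visible text of a page and skips `script` and `style` blocks. It is far simpler than the libraries listed above, and the names `TextExtractor` and `extract_text` are hypothetical, introduced only for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    print(extract_text("<h1>Title</h1><p>Hello <b>world</b></p>"))
```

Real pages need more care (boilerplate removal, encoding detection, malformed markup), which is where the dedicated libraries come in.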
Web Crawling
Libraries to automate web scraping.
- cola - A distributed crawling framework.
- feedparser - Universal feed parser.
- grab - Site scraping framework.
- MechanicalSoup - A Python library for automating interaction with websites.
- pyspider - A powerful spider system.
- robobrowser - A simple, Pythonic library for browsing the web without a standalone web browser.
- scrapy - A fast high-level screen scraping and web crawling framework.
- portia - Visual scraping for Scrapy.
Web Frameworks
Full stack web frameworks.
- Synchronous
- Asynchronous
WebSocket
Libraries for working with WebSocket.
- autobahn-python - WebSocket & WAMP for Python on Twisted and asyncio.
- crossbar - Open-source Unified Application Router (WebSocket & WAMP for Python on Autobahn).
- django-channels - Developer-friendly asynchrony for Django.
- django-socketio - WebSockets for Django.
- WebSocket-for-Python - WebSocket client and server library for Python 2 and 3 as well as PyPy.
Services
Online tools and APIs to simplify development.
Continuous Integration
Also see awesome-CIandCD.
- CircleCI - A CI service that can run very fast parallel testing.
- Travis CI - A popular CI service for your open source and private projects. (GitHub only)
- Vexor CI - A continuous integration tool for private apps with pay-per-minute billing model.
- Wercker - A Docker-based platform for building and deploying applications and microservices.
Code Quality
- Codacy - Automated Code Review to ship better code, faster.
- Codecov - Code coverage dashboard.
- CodeFactor - Automated Code Review for Git.
- Landscape - Hosted continuous Python code metrics.
- PEP 8 Speaks - GitHub integration to review code style.
References
- https://realpython.com/python-virtual-environments-a-primer/
- https://github.com/vinta/awesome-python/edit/master/README.md
- https://www.digitalocean.com/community/tutorials/how-to-install-python-3-and-set-up-a-programming-environment-on-ubuntu-18-04-quickstart
- https://jtemporal.com/requirements-txt/
- https://pip.pypa.io/en/stable/user_guide/
Author
- Bruno Aurélio Rôzza de Moura Campos
Copyright
This work by Bruno A. R. M. Campos is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



















