Etienne Bacher (LISER) – Handling (very) large data with Python and R
Room: P61
Datasets used in the social sciences are growing ever larger as we gain access to full censuses, genomic data, mobile phone data, and many other sources. Most of us are used to handling data ranging from a few hundred to a few million observations, but we can quickly find ourselves limited by the computing power of our laptops. An intuitive solution is to move to computers or servers with more RAM. However, several tools are available that are worth trying before turning to bigger machines. In this talk, I will present “polars”, an extremely fast Python library that is also available as an R package. This will also be an opportunity to discuss other tools (e.g., DuckDB) and general practices for handling large data.
