Showing posts with the label Bigdata

R Foverlaps Equivalent In Python

I am trying to rewrite some R code in Python and cannot get past one particular bit of code. I'…

Sklearn-gmm On Large Datasets

I have a large dataset (I can't fit the entire data in memory). I want to fit a GMM on this data s…

Quickly Sampling Large Number Of Rows From Large Dataframes In Python

I have a very large dataframe (about 1.1M rows) and I am trying to sample it. I have a list of inde…

Python - Parsing A Text Onto Columns By The Position Of Each Item

Bovespa (the Brazilian stock exchange) offers a file with all the quotes in a timeframe. The file is…

Correct Way Of Writing Two Floats Into A Regular Txt

I am running a big job in cluster mode. However, I am only interested in two float numbers, which…

Get A List Of Subdirectories

I know I can do this: data = sc.textFile('/hadoop_foo/a') data.count() 240 data = sc.textFi…
