Python

Introducing Python Package --- gower

Today I am so pleased to introduce my first PyPI package (so much easier to submit compared to CRAN): gower, for calculating Gower distance. The core function was originally published by Marcelo Beckmann. Lots of R packages incorporate this method, but unfortunately Python users have had no equivalent. I took this chance to try the whole package-making experience for PyPI, and here we go!

What is Gower distance?
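As a rough illustration of the idea (a minimal sketch, not the API of the gower package): Gower distance averages a per-feature dissimilarity over all features. Numeric features contribute the absolute difference scaled by the feature's range, and categorical features contribute 0 on a match and 1 otherwise. The function name `gower_distance`, the `ranges` argument, and the sample rows below are all made up for this example.

```python
from typing import Optional, Sequence, Union

Value = Union[int, float, str]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    ranges: Sequence[Optional[float]],
) -> float:
    """Gower distance between two records with equal feature weights.

    ranges[k] is the range (max - min) of numeric feature k over the
    whole dataset, or None when feature k is categorical.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical feature: 0 if the values match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif r == 0:
            # Constant numeric feature: it cannot separate any two records.
            total += 0.0
        else:
            # Numeric feature: absolute difference scaled into [0, 1].
            total += abs(x - y) / r
    return total / len(ranges)


# Hypothetical records: (age, gender, salary).
row_a = (30, "M", 50000.0)
row_b = (40, "F", 60000.0)
# Assumed dataset-wide ranges: age spans 40 years, salary spans 40000.
ranges = (40, None, 40000.0)

d = gower_distance(row_a, row_b, ranges)
print(d)  # (0.25 + 1.0 + 0.25) / 3 = 0.5
```

Because every feature's contribution is scaled into [0, 1], mixed numeric and categorical columns can be combined without one column dominating the result.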

Authenticating AWS (Signature V4) in R using Python Backend

Intuition

I was working on an Elasticsearch project on AWS using Python, and the requests_aws4auth package worked like a charm for me. I never had any issues with authentication (AWS V4 can be hard to work with sometimes). However, when I tried to create a Shiny app for my project, a problem emerged: I just couldn’t get V4 auth to work with httr in R. I tried the aws.signature package on GitHub but kept getting request header issues.

Preprocess Text in Python --- A Cleaner and Faster Approach

Motivation

Well, I think it all started with one of my favorite tweets from 2013:

"In Data Science, 80% of time spent prepare data, 20% of time spent complain about need for prepare data." Big Data Borat (@BigDataBorat), February 27, 2013

When building NLP models, pre-processing your data is extremely important. For example, different choices of stopword removal, stemming, and lemmatization can have a huge impact on the accuracy of your models.