As capital markets evolve, data science is playing a growing role in scaling their operations. In this article, I discuss how data science is currently used.

Photo by Sean Pollock on Unsplash

More businesses today are riding the Big Data analytics bandwagon with the objective of converting insights, gleaned from huge piles of data, into genuine business advantage. In the retail banking space, unstructured data collected from a broad range of social media sources has enabled advanced customer profiling and in-depth analytics, which in turn are helping to enhance customer loyalty and experience. In capital markets, however, firms have traditionally dealt with structured data sets from limited, pre-defined sources. Big data strategies have now begun to impact a select few areas in capital markets firms over the…


From the point of view of a woman in tech

My path to data science is similar to that of most data scientists.

Photo by Isaac Smith on Unsplash

I have a STEM degree and a master’s. I loved math and science growing up and still do. I enjoy learning facts about the world, which aligns with my other hobby, investing. These are all skills that I find essential in data science. Another name for data science is decision science: how to make decisions based on data and how to forecast based on previous data. …


Rising debt and continued quantitative easing, paired with low interest rates, are a ticking time bomb that could set off either an inflationary depression or another roaring 20s. Here is my take on what is to come and how to deal with it.

Key takeaways

  • Inflation is a measure of the rate of rising prices of goods and services in an economy.
  • Currency depreciation is a fall in the value of a currency in terms of its exchange rate versus other currencies.
  • Inflationary depression is a period of severe economic depression characterized by high inflation rates.
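The first two definitions above are both percentage changes, so they can be sketched with a single helper. This is a minimal Python illustration; the function name `percent_change` and every figure in it are hypothetical, not data from the article.

```python
def percent_change(start, end):
    """Percentage change from `start` to `end`."""
    if start <= 0:
        raise ValueError("starting value must be positive")
    return (end - start) / start * 100.0


# Inflation: change in a price index between two periods.
# Hypothetical index values, chosen only for illustration.
inflation = percent_change(100.0, 103.0)

# Currency depreciation: fall in the exchange rate, quoted here as
# units of foreign currency per one unit of domestic currency.
# Hypothetical rates, chosen only for illustration.
depreciation = percent_change(0.80, 0.76)

print(f"Inflation rate: {inflation:.1f}%")
print(f"Exchange-rate change: {depreciation:.1f}%")
```

A negative exchange-rate change under this quoting convention means the domestic currency buys less foreign currency than before, which is the depreciation described above.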

Causes of Inflation

  • Cost-push inflation occurs when prices increase due to increases in production costs, such as raw materials and wages. The demand for goods is unchanged while the supply of goods declines due to the higher costs of production. …

Toronto’s real estate market is arguably in a bubble. This research paper defines what a housing market bubble is and describes its causes and its effects on the economy as a whole. The study analyzes how to combat a bubble and what role policymakers must play to neutralize its effects. It also reviews key lessons from previous market crashes. Finally, it details how to hedge a portfolio against the repercussions of the bubble bursting.

Introduction: Real estate market bubbles

Housing bubbles usually start with an increase in demand, in the face of limited supply, which takes a relatively extended period…


Time is not a line, but a series of now-points. Taisen Deshimaru

By the end of this article, you will be able to:

  • Describe time series analysis, its components, and when to use it.
  • Perform exploratory data analysis on time-series data.
  • Perform time series decomposition.

Time Series

What is Time Series Analysis

It is a statistical technique that deals with time-series data, or trend analysis.

In a time series, time is often the independent variable, and the goal is usually to build a forecasting model that predicts future values of a variable from its past values.
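To make "predicting future values from past data" concrete, here is a minimal Python sketch of a trailing moving average used as a naive one-step forecast. It is only a baseline under stated assumptions, not the method the article goes on to use; the function names and the sample values are hypothetical.

```python
def trailing_moving_average(series, window):
    """Smooth a series by averaging each run of `window` consecutive values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def moving_average_forecast(series, window):
    """Naive one-step-ahead forecast: the mean of the last `window` values."""
    return trailing_moving_average(series, window)[-1]


# Hypothetical observations ordered in time (oldest first).
observations = [10, 12, 14, 16]

print(trailing_moving_average(observations, window=2))
print(moving_average_forecast(observations, window=2))
```

The smoothed series is a rough estimate of the trend component. The forecast ignores seasonality, so it tends to lag behind a series that is steadily rising or falling.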

A common example of this is trying to make stock market predictions. It is very challenging because…


Clustering

K-means Clustering

… Does this look familiar?

It kinda looks like slicing up variance AKA sums-of-squares.

Examples in R

www.r-blogger.com/exploring-assumptions-of-k-means-clusterin-using-r/

www.r-bloggers.com/k-means-clustering-is-not-a-free-lunch/

Assumptions:

  • clusters are spherical
  • clusters are similar sizes

Reporting:

  • We ran a k-means clustering using 2 clusters. Two clusters were chosen by the “elbow method”.
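The sums-of-squares view and the elbow method mentioned above can be sketched in a few lines. The article works in R; this is a standard-library Python illustration instead. The data points are hypothetical, and the farthest-point initialisation is an assumption made to keep the run deterministic, not something the article specifies.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (labels, centers, within-cluster sum of squares)."""
    # Deterministic start: first point, then repeatedly the point
    # farthest from all centers chosen so far (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_point(members)
    wcss = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wcss


# Hypothetical 2-D data with two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# Elbow method: watch how the within-cluster sum of squares falls as k grows.
for k in range(1, 5):
    _, _, wcss = kmeans(points, k)
    print(f"k={k}: within-cluster SS = {wcss:.2f}")
```

The "elbow" is the value of k after which the within-cluster sum of squares stops dropping sharply. Because k-means minimises squared distances to cluster means, it favours the spherical, similarly sized clusters listed under the assumptions.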

Choosing a model


By the end of this article, you should be able to plan, apply, diagnose and report logistic regression models using R.

Scenario

A man was admitted into a hospital for stomach pain. An x-ray revealed a shadow of an eel in his stomach (he had neglected to mention that he had inserted an eel into his digestive tract in order to cure constipation). So the question is: can the treatment of inserting an eel into one’s digestive tract cure constipation?

Read in Data

Check Variables


Black swan points?

Outliers and influential points

Outliers and influential points can have a serious impact on your model but, as we’ve discussed, the definition of an “outlier” depends on the context. When looking at data overall, you will likely want to check for errors in data entry, etc. But when working with a specific model, you may have justification to remove some data to create a more accurate model about the typical cases…at the cost of some generalizability.

With regression models, we have a couple of ways to figure this out.

Outliers

With regression models, residuals that are far away from 0 have increased influence when using…
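As a rough illustration of "residuals far away from 0", the Python sketch below fits a simple least-squares line, scales each residual by the residual standard error, and flags the large ones. The cutoff of 2 is a common rule of thumb assumed here, the data are hypothetical, and these are plain standardized residuals without any leverage adjustment.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def standardized_residuals(x, y):
    """Residuals divided by the residual standard error (n - 2 degrees of freedom)."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)
    residual_se = math.sqrt(sse / (len(x) - 2))
    return [r / residual_se for r in residuals]


# Hypothetical data: y is roughly 2 * x, except for one unusual case.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 6, 8, 25, 12, 14, 16]

z = standardized_residuals(x, y)
# Assumed rule of thumb: flag cases more than 2 standard errors from the line.
flagged = [i for i, zi in enumerate(z) if abs(zi) > 2]
print("Flagged case indices:", flagged)
```

Flagging a case is only a prompt to look at it. Whether to remove it is the context-dependent judgment described above.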


What if you have confounding variables? Are two variables correlated only because of a third variable, and how do we account for that?

Testing for Exam Anxiety

Scenario

A psychologist was interested in the effects of exam stress and revision (i.e., studying) time on exam performance.

She devised and validated a questionnaire to assess state anxiety relating to exams — the Exam Anxiety Questionnaire (EAQ), which produces a measure of anxiety out of 100.

Anxiety was measured before an exam, as was the score of each student, and the number of hours each student spent revising (studying). (from DSUR 4.5 and 6.5)

What do we do first?

Import the data

Let’s get the data

Testing correlation

You can use `cor.test()` to test for correlation.
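For readers who want to see the arithmetic behind a Pearson correlation test, here is a minimal Python sketch. It is not R's `cor.test()` and does not reproduce its output: it computes only the correlation coefficient, the t statistic, and the degrees of freedom, and leaves out the p-value because the standard library has no t-distribution. The numbers are hypothetical, not the exam anxiety data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(x, y):
    """Return (r, t, df) for testing the null hypothesis that the correlation is 0."""
    r = pearson_r(x, y)
    df = len(x) - 2
    if abs(r) == 1.0:
        # A perfect correlation makes the t statistic unbounded.
        return r, math.copysign(math.inf, r), df
    t = r * math.sqrt(df / (1.0 - r ** 2))
    return r, t, df


# Hypothetical paired measurements, chosen only for illustration.
hours = [1, 2, 3, 4, 5]
score = [2, 1, 4, 3, 5]

r, t, df = correlation_t_statistic(hours, score)
print(f"r = {r:.3f}, t = {t:.3f}, df = {df}")
```

The t statistic is compared against a t-distribution with n - 2 degrees of freedom to obtain the p-value.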


Getting more out of the data you were given or currently have

By the end of this article, you will be able to:

  • describe bootstrapping as a technique and explain how confidence intervals are computed using bootstrapping
  • calculate bootstrapped correlations and confidence intervals using R
  • define a function in R
  • calculate, interpret, and report partial correlation using R
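The idea behind the first two objectives can be sketched as a percentile bootstrap: resample the observed pairs with replacement many times, recompute the correlation each time, and read the confidence interval off the sorted results. The article does this in R; the version below is a standard-library Python illustration with hypothetical data, a hypothetical function name, and an assumed default of 2000 resamples.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation, or None if either variable has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def bootstrap_correlation_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the Pearson correlation."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        # Resample whole (x, y) pairs with replacement to keep them matched.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples with no variance
            estimates.append(r)
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical paired data, chosen only for illustration.
x = [2, 4, 5, 7, 8, 10, 11, 13]
y = [1, 3, 2, 6, 5, 9, 8, 12]

lower, upper = bootstrap_correlation_ci(x, y)
print(f"observed r = {pearson_r(x, y):.3f}")
print(f"95% bootstrap CI: [{lower:.3f}, {upper:.3f}]")
```

The percentile method makes no distributional assumption about the data, which is why bootstrapping pairs naturally with non-parametric analyses.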

Non-parametric tests

Another data set

Last time, we saw another data set — a measure of how creative people are and their position (i.e., rank) in a “greatest liar” competition.

Anh Le

Data science and Blockchain Enthusiast | Chess Player
