Carmen Wong

Data Analyst skilled in SQL, Python and Power BI.

I was graduated from top-ranked universities in Hong Kong, with Master of Science in Financial Analysis and Bachelor of Engineering (Hons) in Systems Engineering & Engineering Management. I developed my career path as a research analyst in the banking and finance industry, specialized in analyzing business and financial data of listed companies and publishing research reports.

Proficiency

My mixed background in computer science, statistics and financial analysis is a plus in data analytics.

  • Data Analytics Tools
    (Microsoft Power BI, Excel)
  • Programming (SQL, Python, DAX)
  • Statistical Analysis (e.g. Time Series Analysis, Regression Analysis)
  • Financial Analysis

Projects

The data analystics projects are comprised primarily of two sections — financial analysis and demographic analysis. The first part concerns about financial analysis of global retail stocks, in addition to time series analysis of Tesco (TSCO. L) share prices. The second part is about demographic analysis for each local authority in England. In particular, there is correlation and regression analysis on the Key Stage 4 results which involves model development and evaulation. Likewise, academic performance of schools is presented in Excel dashboard as an alternative way of data visualization.

Time Series Analysis (ARIMA model)

Python scripts are used to analyze and forecast the stock prices of TESCO (TSCO. L) using Autoregressive Integrated Moving Average (ARIMA) model, which is an integral part of technical analysis of stock.

Main steps are as follows: 1) Download TESCO stock prices from Yahoo Finance. 2) Check for randomness in lag plot. 3) Test stationarity (Augmented Dickey-Fuller Test) and carry out data transformation by differencing. 4) Interpret lag orders from Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots. 5) Split the data into training and testing data. 6) Search for the optimal parameters for the ARIMA model. 7) Fit the model with the training data. 8) Forecast the stock prices with a 95% confidence interval. 9) Model evaluation by Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE).

#Python #Jupyter Notebook #Time Series Analysis

Financial Analysis of Stocks (Power BI)

This report covers the fundamental analysis of supermarket stocks (Sainsbury and Tesco) of London Stock Exchange and the comparison to their global peers such as Walmart (WMT) and Carrefour (CA.PA). Stock prices and financial data can have real-time updates just by refreshing the report. By comparing different financial metrics including valuations, profitability, growth and financial position of the stocks, stakeholders and investors can make informed investment decision from the analysis.

Please click here for online demonstration.

#Python #Power BI #Financial Analysis

Data Extraction with SQL

Here comes the start of the second part of my analysis. Firstly, data is extrcated from the .csv files downloaded from the GOV.UK website, which are three separate tables concerning individual school information, Key Stage 2 (KS2) results and Key Stage 4 (KS4) results of the schools. Data cleaning processes include i) dropping unrelated rows, ii) merging tables, iii) handling null values, iv) checking duplicates, etc.

After executing the SQL code in Microsoft SQL Server, a new and cleaned table is generated and is ready for use in subsequent data analysis — Demographic Analysis and School Performance Dashbord .

#SQL #Microsoft SQL Sever

Demographic Analysis (Power BI)

After importing the cleaned dataset extracted from MS SQL earlier on as well as other datasets from the Office for National Statistics, this report is targeted specifically at new immigrants to find their best place to live in England, in respect of i) safety, ii) housing prices, iii) population and income deprivation and iv) school performance. At the end of the report, a final score will be calculated for each selected local authority, and users can spot the ideal place to live in their selections.

Please click here for online demonstration.

#Python #Power BI

Regression Analysis

Correlation analysis is used to find the relationship amongst the variables which affect the results of KS4 of the schools in England. In light of the relatively high correlations between i) Ofsted ratings and ii) the proportion of disadvantaged students (explanatory variables) and KS4 results (target variable), multiple linear regression model is developed. To show how well the model fits the data, it is evaluated by comparing distribution plots between the actual and fitted data as well as calculating the score of the model. Data wrangling is performed prior to the model development, including the introduction of dummy variables for Ofsted ratings.

#Python #Jupyter Notebook #Multiple Linear Regression

Excel Dashbord for School Performance

Here is a dashboard of school performance in England, which shows different types of schools within a region(s), top 5 local authorities (LAs) with the best KS2 and KS4 results, top 5 LAs with the highest number of good and outstanding schools, etc. Users can refine the results by narrowing down the region names, school type, school level as well as the gender of the schools.

#Microsft Excel

Contact Me

If you would like to know more about my profile, please feel free to drop me a message here. Thank you.