Used cars marketplace – Data Analysis – Hungary

In this data analysis I look at the Hungarian used car market. I start by examining the third-party data: how it was collected, what it contains, and what kinds of visualizations and correlations it can support.

After the exploration, the data cleaning begins: checking for missing values, errors, and outliers in the data.

For the visualization I will use Tableau Public, which also helps me find correlations in the data: what trends appear, and what the main influencing factors are for each question.

The Data

The data comes from a third-party source, kaggle.com. It is based on one of the biggest used car marketplaces in Hungary. The source is clean and the data collection method is transparent, as are the .csv files that store the data.

The collected advertisements range from 2007 to 2020.05, mainly from 2019.01.01 to 2020.05.06.

The whole data collection method is also available on GitHub, where we can verify the Python code too. The different attributes of the advertisements, for example the models and brands, are stored in separate csv files from the original one. I will use the Tableau Data Source option to make connections between them, but for later purposes I've put all the csv files together into one xlsx.
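The connection between the main advertisement file and a lookup file can be sketched in Python as well. This is only a minimal illustration: the column names (`ad_id`, `brand_id`, `brand_name`, `ad_price`) and the sample rows are assumptions, not the real schema of the Kaggle files.

```python
import csv
import io

# Hypothetical file contents; the real column names and values may differ.
ADS_CSV = """ad_id,brand_id,ad_price
1,10,2500000
2,20,4100000
3,10,1900000
"""
BRANDS_CSV = """brand_id,brand_name
10,Opel
20,Ford
"""

def join_ads_with_brands(ads_text, brands_text):
    """Attach brand_name to every ad row via the shared brand_id key."""
    brands = {
        row["brand_id"]: row["brand_name"]
        for row in csv.DictReader(io.StringIO(brands_text))
    }
    joined = []
    for row in csv.DictReader(io.StringIO(ads_text)):
        row = dict(row)
        # Ads with an unknown brand_id keep an empty name (a left join).
        row["brand_name"] = brands.get(row["brand_id"], "")
        joined.append(row)
    return joined

rows = join_ads_with_brands(ADS_CSV, BRANDS_CSV)
```

The same pattern would apply to the model lookup or any other attribute file that shares a key with the advertisements.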

I would like to use the color attribute. Right now it's an integer-based value, not a name, but it is obvious that the person used the advanced search values for that. This attribute had no csv file of its own, so one has been created.

Questions

I will look for answers to the following questions.

How are the prices of diesel and petrol cars falling?

In the following visualization I analyze the difference between the Manufacturer's Suggested Retail Price (MSRP from here on) and the advertised price for petrol and diesel cars, broken down by year and month. With this type of analysis we can clearly see how much value the cars have lost. These are averages, and some important factors, like kilometers, brand, and type, are not included.

Despite that, we can see a clear difference between petrol and diesel. Although the petrol cars (blue line) have bigger spikes, the diesel cars mostly reach the 5 million mark. All in all, we can say that diesel cars can lose more than 5 million Ft of their value over time.

How did sales prices develop in 2018-2020?

Is there any trend in car sale prices, increasing or decreasing, such as a seasonal drop?

We can see that over time the number of advertisements on the platform is increasing, and so are the car sale prices.

Which city published the most advertisements, and where do cars sell the most?

On the map, we can see where the most advertisements were placed and where the most cars were sold. These places include Kecskemet, Fot, and Szekesfehervar. Looking closer, we can observe a significant increase in the outer cities, like Szombathely, Zalaegerszeg, Szeged, Bekescsaba and Nyiregyhaza. There are some small outlying villages as well where the same increase is noticeable.

Furthermore, the map supports the expectation that denser cities have more advertisements.

What correlation can we find between a car's engine and its value?

In this question I'm mainly looking for which cars, by number of cylinders and type, lose the least value over time relative to their MSRP.

For this type of question I will use Microsoft's data analysis software, Power BI, because of its decision tree and main influencing factors visuals.

Microsoft's software makes it as easy as Tableau does to create connections between the tables. This time I've created a new column, where I subtracted the Ad Price from the MSRP to get a value that shows the difference. If the value is positive, the car lost value, but it can also happen that a car gained value over time. That's when the column shows negative numbers.
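The sign convention of this difference column is easy to get backwards, so here is a minimal sketch of it in Python. The function name and the sample prices are illustrative assumptions, not values from the dataset.

```python
def value_loss(msrp, ad_price):
    """Difference column: MSRP minus advertised price, in forints.

    Positive result -> the car is advertised below its MSRP (it lost value).
    Negative result -> the car is advertised above its MSRP (it gained value).
    """
    return msrp - ad_price

# Hypothetical prices, for illustration only.
lost = value_loss(msrp=10_000_000, ad_price=7_500_000)     # positive: value lost
gained = value_loss(msrp=40_000_000, ad_price=46_000_000)  # negative: value gained
```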

The picture below is based on this column and was built with the help of a decision tree. Other main factors were added: power, ccm, cylinder layout and number of cylinders.

At first sight, there is a 626 horsepower, 4395 ccm, V8 car that increased its value by more than 6 million forints. After adding a tooltip based on the catalog, it turned out that this is a 2019 BMW X6 M Competition.

On further examination I can say that most of the time it is the premium SUVs and higher-category cars that increased in price.

The reason why this happened can't be found in the current database, but it could be measured with a qualitative analysis, or even with a text analysis of the ad descriptions that searches for words like “upgraded”, “new”, or “tuning” and turns each of them into a boolean value.
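A minimal sketch of such a text analysis, assuming each ad has a free-text description. The keyword list uses the English examples from the paragraph above; the real ads are presumably in Hungarian, so the actual keywords would need to be their Hungarian equivalents. The function name is hypothetical.

```python
import re

# Example keywords from the text; real ads would need Hungarian equivalents.
KEYWORDS = ("upgraded", "new", "tuning")

def keyword_flags(description, keywords=KEYWORDS):
    """Return one boolean per keyword: does it occur as a whole word?"""
    words = set(re.findall(r"\w+", description.lower()))
    return {kw: kw in words for kw in keywords}

flags = keyword_flags("Brand NEW tyres, chip tuning done last year")
```

Matching on whole words keeps "newly" or "renewed" from being counted as "new". Each flag could then be added as a boolean column next to the value difference.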

How much does highlighting influence a car's sale price and time to sell?

On the platform you have the option to highlight your advertisement, which can boost it even to the first page of the results. How much influence does this have?

The visualization shows the average ad age and the average ad price, based on the highlight and sold values. The graph indicates that highlighted cars sold faster than non-highlighted ones, by less than 30 days. Furthermore, highlighted and sold cars tend to have higher prices, more than 2 million forints above the non-highlighted ones. We can see an even bigger difference in the not sold column, where it almost reaches 4 million forints.
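The aggregation behind a chart like this can be sketched as a group-by over the highlight and sold flags. The field names (`highlighted`, `sold`, `age_days`, `price`) and the sample ads are assumptions made for the example, not the dataset's real columns or values.

```python
from collections import defaultdict

def group_averages(ads):
    """Average ad age and price for each (highlighted, sold) combination."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # age total, price total, count
    for ad in ads:
        acc = sums[(ad["highlighted"], ad["sold"])]
        acc[0] += ad["age_days"]
        acc[1] += ad["price"]
        acc[2] += 1
    return {
        key: (age_total / count, price_total / count)
        for key, (age_total, price_total, count) in sums.items()
    }

# Hypothetical ads, for illustration only.
sample_ads = [
    {"highlighted": True, "sold": True, "age_days": 40, "price": 5_000_000},
    {"highlighted": True, "sold": True, "age_days": 60, "price": 7_000_000},
    {"highlighted": False, "sold": True, "age_days": 80, "price": 4_000_000},
]
averages = group_averages(sample_ads)
```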

What percentage of the cars were published by prosellers in 2018-2020?

A large number of cars are published by prosellers. How many more cars do these prosellers publish than an average person? How much more traffic is generated by prosellers than by average people?

On the diagram below, we can clearly see that more than 70% of the ads were uploaded by prosellers. A similar share applies to the highlights: nearly 75% of the highlights were bought by prosellers.

Which car colors sell the most and the quickest?

This is the question everyone wants answered!

The graph shows which car colors are published most often on the website. We can see that buttery, white, and silver are the three that stand out. But we can also see that terrain, blue, and red cars are significant too. However, these colors are the most published ones, which does not necessarily mean they are the ones that sell most often. Each column breaks down into two bars, which show the percentage of sold and not sold cars. The tawny ones sold 43% of the time on average, while for the green ones the figure is 33%.

One important piece of information in the diagram is the ad age by color. Clearly, violet and ocher cars stay on the website the longest: these cars stay listed for more than 200 days before they are sold.

What's the correlation between car prices and kilometers?

The expected answer is that as a car's kilometers increase, its price decreases. But are we sure? Let's settle it.

The scatterplot below demonstrates that this statement is true: if a car has low kilometers, it is more expensive. We should include one more important factor, the production date of the car. The scatterplot color shows this: if the color is bright, the car is older; if the color is darker, the car is younger. However, this effect is not as significant as the price-kilometer correlation.
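What the scatterplot shows visually can also be put into a single number with the Pearson correlation coefficient. The sketch below uses plain Python and made-up sample values, so it illustrates the calculation only, not the dataset's actual result.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical cars: kilometers driven and ad price in million forints.
kilometers = [20_000, 80_000, 150_000, 220_000, 300_000]
prices = [6.2, 4.8, 3.1, 2.4, 1.2]
r = pearson(kilometers, prices)  # negative: price falls as kilometers rise
```

A value close to -1 means a strong negative relationship, a value near 0 means almost none.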

There were some outliers in the kilometer column, so it was capped at 1 million kilometers.
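One way to apply such a limit is to clip every value at the maximum, as in the sketch below. This assumes the outliers were capped rather than dropped, which the text does not state; the function name and sample values are hypothetical.

```python
MAX_KM = 1_000_000  # upper limit mentioned in the text

def cap_kilometers(values, limit=MAX_KM):
    """Clip every kilometer value at the limit, keeping the row itself."""
    return [min(value, limit) for value in values]

capped = cap_kilometers([120_000, 2_500_000, 1_000_000])
```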

Which car brand stands out on the Hungarian market?

In first place we have Opel, which was published most often between 2018 and 2020.05, followed by Ford and Volkswagen. BMW sits in fourth place, followed by Mercedes-Benz. However, if we look at the share of cars sold, first place goes to Daewoo with 44%, followed by Suzuki with 39.78%.

Which car brand retains its value the most?

In the next line chart, the orange line is the average MSRP and the blue line is the average ad price. If the two lines are close to each other, the brand has not lost much of its original value. Dividing the ad price by the MSRP and multiplying by 100, we get a percentage that shows the car's current value relative to its MSRP. Based on that, the prices of Dacia, Volvo, Nissan, Mazda, Toyota, and Ford reach more than 50%, so they retain their value.
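The percentage described above is the average ad price as a share of the average MSRP. A minimal sketch, with a hypothetical function name and made-up prices:

```python
def retained_value_percent(avg_msrp, avg_ad_price):
    """Average ad price as a percentage of the average MSRP."""
    if avg_msrp <= 0:
        raise ValueError("avg_msrp must be positive")
    return avg_ad_price * 100 / avg_msrp

# Hypothetical brand averages in forints, for illustration only.
share = retained_value_percent(avg_msrp=10_000_000, avg_ad_price=4_000_000)
```

A result above 50 would put a brand in the group that retains its value, by the threshold used in this analysis.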

Of course, there are many other aspects we should explore, like the production year, edition, and engine type, not to mention the inflation of the forint.

Finally, if we would like to buy a Chrysler, an Alfa Romeo, a Land Rover, or a Porsche, their prices can fall over time to as low as 40% of their MSRP.

Would you like a similar analysis of your business? You can write to me via the contact page.
