Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NVMe SSD improves database scan times exponentially #9

Open
miguelcleon opened this issue Jul 30, 2019 · 10 comments
Open

NVMe SSD improves database scan times exponentially #9

miguelcleon opened this issue Jul 30, 2019 · 10 comments

Comments

@miguelcleon
Copy link
Member

I recently got a new desktop with an NVMe SSD and I'm getting huge performance increases in query times. A query for a time series with about 140,000 values takes about 11.6 seconds on a server while it takes 0.334 seconds on the desktop with the NMVe SSD. This appears to be due to a data throughput of about 2.7 GB/s vs 95 to 100 MB/s on the server. This results in a massive decrease in the amount of time it takes to scan the database for the requested data.

@emiliom
Copy link
Member

emiliom commented Jul 30, 2019

Thanks so much for sharing this, Miguel! I too have been musing lately about SSD's and relational databases, but I haven't done any actual, solid comparisons. I do struggle with some queries (and DELETE or TRUNCATE statements) that are taking forever on a PostgreSQL database (not ODM2) on a server with spinning disks vs on my laptop with an SSD. But it's not an apples-to-apples comparison, b/c the server is getting old and has older versions of everything, compared to my laptop.

@miguelcleon
Copy link
Member Author

And the NVMe SSDs are much faster then SATA SSDs. It seems like a significant development.
https://www.pcworld.com/article/2899351/everything-you-need-to-know-about-nvme.html

@emiliom
Copy link
Member

emiliom commented Jul 30, 2019

Thanks for the info and the article; very helpful. I wasn't aware of NVMe. I think I've been using only SATA SSDs, but not completely sure. Update: Looks like I do have NVMe on my Linux laptop from Nov. 2016 ("256 GB PCle M.2 SSD"). No wonder it feels so fast compared to my server!

@horsburgh
Copy link
Member

We've had issues with this in our Hydroinformatics class. I used to give students a SQL Dump file that loaded a bunch of observational data into an ODM 1.1.1 database in MySQL. On machines with an SSD, this data loading script would take about a minute to run, but students with spinning disks would take multiple hours to run the same script (and most would give up and send me an email to complain).

@aufdenkampe
Copy link
Member

That's all great to know. My late 2016 MacBook Pro has an NVMe SSD, and it was noticeably faster than by previous MacBook Pro with a SATA III SSD.

Is there a way to configure AWS instances with SSDs? (that is affordable?)

@emiliom
Copy link
Member

emiliom commented Jul 30, 2019

Is there a way to configure AWS instances with SSDs? (that is affordable?)

I assume there is, and that it's bound to be more expensive. Ask me again in a couple of months.

@miguelcleon
Copy link
Member Author

miguelcleon commented Jul 30, 2019

Yes, you can get AWS instances with NVMe SSDs and prices are pretty reasonable, maybe only a little higher. The cost of these SSDs have come down a lot.
Here is one link about some of them, this is from over a year ago.
https://aws.amazon.com/blogs/aws/ec2-instance-update-c5-instances-with-local-nvme-storage-c5d/

@miguelcleon
Copy link
Member Author

miguelcleon commented Jul 30, 2019

or better here https://aws.amazon.com/ec2/pricing/on-demand/

No SSD:
m5.xlarge | 4 | 16 | 16 GiB | EBS Only | $0.192 per Hour

150 GB NVMe SSD:

m5ad.xlarge | 4 | N/A | 16 GiB | 1 x 150 NVMe SSD | $0.206 per Hour

@aufdenkampe
Copy link
Member

Wow, those are reasonable prices.
Is 150 GB of storage enough for your instances?

I also found this under "Storage Optimized":
i3.xlarge | 4 | 13 | 30.5 GiB | 1 x 950 NVMe SSD | $0.312 per Hour

@miguelcleon
Copy link
Member Author

Yes, it would be enough. Yeah, it's possible those storage optimized instances would work better but I'm not sure. One reddit post I saw said you only get a portion of the NVMe bandwidth with smaller general purpose instances.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants