Data Hashing

This lesson covers data hashing. Data hashing is a one-way mathematical algorithm that produces a string of numbers, like a fingerprint, that cannot be reversed. It allows for file verification and helps with Public Key Infrastructure (PKI).

Video Transcription
So the next topic that we're going to talk about with encryption is data hashing. So what is data hashing? Well, we already talked a little bit about encryption, and we talked about asymmetric encryption and symmetric encryption.
Data hashing is a one-way mathematical algorithm that allows us to get a string of numbers, like a fingerprint, that we can't reverse. It allows for file verification, and it helps us with PKI, our public key infrastructure.
So what does data hashing do?
If we take a packet of data, or we want to download a file from a server, a lot of times we may see underneath that file a string of numbers, and it says, "Here's the hash for this file." Well, what do we do with that?
What that means is that before the file was hosted on the server, it was put through a hashing algorithm. So we have the file here.
And remember, all files, all folders, all data communications are just a string of numbers.
So this string of numbers, we're going to say, is 1234567, which is not what the string of numbers would actually be for our file.
These are some of the numbers that make up our file; we have thousands and thousands of individual bits that make up our file. All of those bits of data are put through an algorithm such as MD5 or SHA. MD5, Message Digest 5, is one of the more popular but outdated methods.
We put the file through our hashing algorithm, in this example MD5, which is not an extremely good one, and we get a fingerprint out of the other side. It's a one-way algorithm: we can't get the file from the hash; we can only get the hash from the file.
So we put the file, our long string of numbers, into the algorithm, and MD5 gives us back another string of numbers that isn't as long as all the numbers in our data but is still a long string of numbers. We're going to show a short example here;
an MD5 hash is actually a very, very long number (128 bits, usually written as 32 hexadecimal characters).
But it will give us a number, and that's our hash.
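As a rough sketch of that step, assuming Python's standard hashlib module (the function name fingerprint and the sample bytes are made up for illustration, and some locked-down Python builds disable MD5):

```python
import hashlib

def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data using the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()

# Stand-in for the file's bytes, echoing the lesson's "1234567" example.
data = b"1234567"

print("MD5    :", fingerprint(data, "md5"))     # 32 hex characters
print("SHA-256:", fingerprint(data, "sha256"))  # 64 hex characters
```

The digest has a fixed length no matter how large the input is, which is why it works as a compact fingerprint.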
So after we download this file off the server, we look on the known good website, and they say the hash for this file is 246310.
We take this known algorithm, because MD5 is a known algorithm, take the file that we just downloaded, put it through the algorithm, and see if we get the same hash. If we get the same hash, that means the file was not changed.
If even a single bit of data was changed in that file or that program, the hash will be completely different.
So even if a single bit has changed, we get a completely different hash.
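The check described above might look like the following sketch. It uses SHA-256 rather than MD5, and the helper names and sample contents are assumptions for illustration, not anything from the lesson:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex fingerprint of the given bytes."""
    return hashlib.sha256(data).hexdigest()

def matches_published_hash(downloaded: bytes, published_hex: str) -> bool:
    """Re-hash the download and compare it with the hash shown on the site."""
    return sha256_hex(downloaded) == published_hex.strip().lower()

def flip_one_bit(data: bytes, bit_index: int = 0) -> bytes:
    """Return a copy of data with exactly one bit changed."""
    changed = bytearray(data)
    changed[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(changed)

original = b"contents of the file on the server"
published = sha256_hex(original)      # what the site would list
tampered = flip_one_bit(original)     # a single bit differs

print(matches_published_hash(original, published))  # True
print(matches_published_hash(tampered, published))  # False
print(sha256_hex(original))
print(sha256_hex(tampered))  # looks unrelated to the line above
```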
There's no way that we can take this number, or the other hash that we knew was good, and put it backwards through MD5 and get our file. There's only one way we can go: we can only put the file through the algorithm, one way, to get our hash.
So our hash is a fingerprint of a file. Just like we say everyone's fingerprints are unique, my fingerprint is an identifier of me, and you can't build a whole me from my fingerprint; you can only get a fingerprint from me. In the same way, a hash is a verifier of a file,
so we get the file's fingerprint,
and it lets us know that the file is what it claims to be, that it hasn't been changed.
Some hashing algorithms have things called collisions. Collisions are cases where you can get the same hash from completely different files, and those are bad. We want every single hash to be individually unique.
That's why MD5 is being replaced by SHA: MD5 has known collisions, and SHA is better because it has fewer known collisions, at least at this point in time.
So we would prefer SHA over MD5, because MD5 has known collisions.
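To make the idea of a collision concrete, here is a toy sketch. It is not an attack on MD5 or SHA; it deliberately cuts a SHA-256 digest down to 2 bytes (an assumption made only so a collision turns up quickly), and the function names are made up:

```python
import hashlib

def tiny_hash(data: bytes, size: int = 2) -> bytes:
    """A deliberately weak hash: only the first `size` bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:size]

def find_collision(size: int = 2):
    """Hash "0", "1", "2", ... until two different inputs share a tiny hash."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode()
        h = tiny_hash(message, size)
        if h in seen:
            return seen[h], message, h
        seen[h] = message
        counter += 1

first, second, shared = find_collision()
print(first, second, shared.hex())  # two different inputs, one tiny hash
```

With only 65,536 possible outputs a repeat is unavoidable; a full-length digest makes the same search impractical unless the algorithm itself has a weakness.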
So data hashing is like a fingerprint, and it's great for file verification and PKI.
It's also how servers store our passwords securely: they store password hashes, not our actual passwords. When we type in our password and save it, the server saves the hash of our password, so that whenever we enter our password, the server takes whatever we entered in the box,
makes a hash of it, and then compares the stored hash
with the hash that it just got from us. You can't take the hash, put it in the password field, and get through, because the server is then going to hash the hash and get a completely different number.
So hashes are a much more secure way of storing passwords, because if the password storage server is compromised, the only thing anybody is going to get is a bunch of hashes, and you can't get the original password from the hash; you just get a bunch of random-looking numbers.
So it's a lot more secure, and you won't be as worried if your password server is compromised,
because all it's storing is hashes. That's not going to give attackers actual password information unless they try to crack those hashes.
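A minimal sketch of that store-and-compare flow follows. The lesson describes plain hashing; this sketch instead assumes a per-user random salt and PBKDF2 from Python's hashlib, which is a common way to make cracking slower, and the function names and iteration count are illustrative assumptions:

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for this example

def make_record(password: str):
    """Return (salt, hash) to store; the password itself is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(attempt: str, salt: bytes, stored: bytes) -> bool:
    """Hash whatever was typed in the box and compare with the stored hash."""
    candidate = hashlib.pbkdf2_hmac("sha256", attempt.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)

salt, stored = make_record("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
# Typing the stolen hash into the password box just hashes the hash:
print(check_password(stored.hex(), salt, stored))     # False
```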
Data hashing, again, is like a fingerprint. So how do we use it in PKI? How does hashing help with our PKI?
Well, with our PKI, when we're sending our data back and forth, what we may be doing is sending a digital signature, which is a hash that is encrypted with our private key.
Say Alice wants to send Bob an email message. But Alice and Bob don't live in a sealed environment; Alice and Bob don't live in a vacuum.
There's also Carol, and Carol wants to impersonate Alice's messages.
So Alice sends Bob a message that's encrypted with Bob's public key and is just being sent to Bob,
and Bob receives it and decrypts it with his private key, as we talked about a little bit earlier.
Well, if Carol intercepted this message that Alice is sending to Bob, because it's encrypted with Bob's public key and she only has Bob's public key, she can't decrypt what's in here.
But what she could technically do is send a fake message to Bob,
encrypted with his public key, and say that it's from Alice.
And that would be a mess, because now Bob decrypts it with his private key and thinks the message is from Alice.
So in order to prevent that, we can introduce data hashing.
Bob has a public key and a private key, and Alice also has a public and a private key.
She is going to send Bob data and encrypt it with his public key, but she is also going to take a hash of this data packet. She's going to put it through a one-way algorithm, and she'll get the hash. We're going to say the hash of this data is 47231.
She then takes this hash and encrypts it with her private key.
So this hash is now encrypted with Alice's private key, and it's tacked onto the end of the data.
Then this whole packet, the data and the encrypted hash, can be sent to Bob.
So Bob is then going to use Alice's public key to decrypt that hash at the end and get the original hash.
He is then also going to put her data through a hashing algorithm and see if the hash he gets of the data matches the hash that she stuck on the end.
So that verifies for Bob that he's getting the data that was actually sent from Alice and that it has not been modified in transit.
If that data was modified in transit at all, the hash Bob gets when he puts it through the hashing algorithm would be different from the hash that Alice sent him, because she put one set of data through the hashing algorithm and it gave her one hash number. If Carol tries to modify that data at all, then when Bob puts that data through the hashing algorithm, it'll give him a different number.
And Carol cannot modify the number at the end of the packet, because that number is encrypted with Alice's private key. So if Carol tries to spoof this, she can't, because she doesn't have Alice's private key.
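The hash-then-encrypt-with-the-private-key step can be sketched with a toy RSA key pair. Everything here is an assumption for illustration: the key is built from two small known primes, there is no padding, and the function names are invented, so it must not be used for real security. The sketch also leaves out encrypting the message itself with Bob's public key and shows only the attached hash:

```python
import hashlib

# Toy RSA key pair standing in for Alice's keys (illustration only:
# far too small and unpadded for real use).
P = 2**61 - 1                      # a known Mersenne prime
Q = 2**89 - 1                      # another known Mersenne prime
N = P * Q                          # public modulus
E = 65537                          # Alice's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # Alice's private exponent (Python 3.8+)

def hash_to_int(data: bytes) -> int:
    """Hash the data with SHA-256 and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def alice_sign(data: bytes) -> int:
    """Alice encrypts the hash of the data with her private key."""
    return pow(hash_to_int(data), D, N)

def bob_verify(data: bytes, encrypted_hash: int) -> bool:
    """Bob decrypts the hash with Alice's public key and re-hashes the data."""
    return pow(encrypted_hash, E, N) == hash_to_int(data)

message = b"Meet me at noon. Alice"
signature = alice_sign(message)

print(bob_verify(message, signature))                    # True
print(bob_verify(b"Meet me at nine. Alice", signature))  # False
```

Carol can change the message, but without the private exponent she has no practical way to produce an encrypted hash that matches the altered data.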
So again, the deeper we get into encryption, the more complicated it gets. But data hashing is an important concept to understand, because it's important not only in how we transfer files but also in our PKI.
It's important in verifying that files are the same from a sender to a receiver, in verifying that data was not changed in transit, and in how our passwords should be securely stored on our network and on our servers.