DRDY ERR from HDD

All of a sudden I started getting the DRDY ERR on my laptop running Linux. Some of the messages look like this:

 ata1:00: status: { DRDY ERR }
 ata1.00: error {UNC }
 ata1:00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
 ata1:00: BMDMA stat 0x25
 ata1:00: failed command: READ DMA

Finally it drops me to a prompt asking me to enter a runlevel, and after that:

 INIT: no more processes left in this runlevel

Suspecting a HDD crash, I took it out and used in another PC as an external USB HDD drive and I was able to mount & see all partitions and files within. So I assume Disc is OK.

[EDIT/UPDATE]

I'm also able to boot the laptop from a USB pen drive (with Linux) and can even see all the partitions on the disk and access them.

I also took the HDD out, put it in an external casing, and tried booting the same laptop from it, but got the following, different errors:

 end_request: critical target error, dev sda, sector 32839936
 EXT4_fs error: (device sda5): ext4_find_entry:935: inode #393217: comm init: reading directory lblock 0
 INIT: No inittab file found
 Enter runlevel:

So I guess the HDD is accessible as storage, but not bootable. The partitioning scheme on that HDD is as follows, in case it helps (GPT scheme):

 partition    FileSystem   size     flags
 ---------    ----------   ----     -----
 /dev/sda1    unknown      2.00MB   bios_grub
 /dev/sda2    ext2         128MB             # was supposed to be common boot partition for chain loading
 /dev/sda3    swap         1.5GB
 /dev/sda4    ext4         8GB               # Linux 1 (somehow, Grub does not show this in the menu, cannot boot into)
 /dev/sda5    ext4         8GB               # Linux 2 (I could only boot into this one from Grub.)
 /dev/sda6    ext4         94GB              # DATA
 unallocated  _            1MB

I installed the Linux distributions one after another and actually wanted to install Grub in /dev/sda2 and chainload Linux 1 & 2, but before I could do it, I hit this snag!

Any ideas? Solutions?

[UPDATE 2]

* Title of the problem is no longer applicable *

I booted from USB and ran 'fsck' on all partitions. All except /dev/sda5 were reported clean. /dev/sda5 reported many errors (probably a couple of hundred); I just kept entering 'y' at all the prompts. In between there were messages like 'linking 'lost+found' ....' After running 2 passes on all partitions, I rebooted from the HDD, and here is the latest error:

 INIT: version 2.88 booting
 INIT: No inittab file found
 Enter runlevel:

Does it look like I'd be able to get back the OS instance and boot?

1 Answer

The first error you reported:

ata1:00: status: { DRDY ERR }
ata1.00: error {UNC }
ata1:00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1:00: BMDMA stat 0x25
ata1:00: failed command: READ DMA

says that a READ DMA ATA command to a disk on ATA port 1 failed: the status includes ERR (error), and the UNC flag in the error field means the drive hit an uncorrectable data error, that is, it could not read the requested data back. That port is most likely the hard disk, and the error points toward the drive itself having problems. The DMA part can likely be ignored: DMA is Direct Memory Access, the dominant transfer mode these days, and if you had RAM or memory bus problems severe enough to hit something like this repeatedly, you'd likely be seeing a ton more errors, if the system was able to function at all.
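To make the flag names in that message less opaque, here is a minimal sketch of how the ATA status and error register bytes map to names like DRDY, ERR and UNC. The bit values are the standard ATA register bits; the helper name `decode` and the two sample bytes are my own assumptions, since the raw register values do not appear in your excerpt (0x41 is simply the byte with only DRDY and ERR set, 0x40 the byte with only UNC set).

```python
# Standard ATA status register bits, highest bit first.
STATUS_BITS = {
    0x80: "BSY",    # device busy
    0x40: "DRDY",   # device ready
    0x20: "DF",     # device fault
    0x10: "DSC",    # seek complete
    0x08: "DRQ",    # data request
    0x04: "CORR",   # corrected data (obsolete)
    0x02: "SENSE",  # sense data available (IDX on old drives)
    0x01: "ERR",    # error, details are in the error register
}

# Standard ATA error register bits, highest bit first.
ERROR_BITS = {
    0x80: "ICRC",   # interface CRC error
    0x40: "UNC",    # uncorrectable data error
    0x20: "MC",     # media changed
    0x10: "IDNF",   # sector ID not found
    0x08: "MCR",    # media change requested
    0x04: "ABRT",   # command aborted
    0x02: "TK0NF",  # track 0 not found
    0x01: "AMNF",   # address mark not found
}


def decode(value, names):
    """Return the names of all bits set in a register byte (hypothetical helper)."""
    return [name for bit, name in names.items() if value & bit]


# Sample bytes chosen to match the flags shown in the log, not taken from it.
print(decode(0x41, STATUS_BITS))  # ['DRDY', 'ERR']
print(decode(0x40, ERROR_BITS))   # ['UNC']
```

The point is only that DRDY on its own is harmless (the drive is ready); it is ERR together with UNC that says the drive could not deliver the data.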

The second error:

end_request: critical target error, dev sda, sector 32839936
EXT4_fs error: (device sda5): ext4_find_entry:935: inode #393217: comm init: reading directory lblock 0
INIT: No inittab file found

says there is some problem on /dev/sda at sector 32839936, which with 512-byte sectors is a little under 16 GiB into the disk and so falls in the latter part of the /dev/sda5 partition, which matches the device sda5 reported by the file system driver. The error reported by init together with the file system driver's error details points toward file system damage making /etc/inittab unavailable or (less likely) unreadable. This would mean that the root directory, the /etc directory, or the /etc/inittab directory entry is somehow involved in the corruption. Given that the driver failed while reading a directory block (ext4_find_entry, reading directory lblock 0), I'd take a shot at the directory holding the /etc/inittab entry, most likely /etc itself, being the culprit, until proven wrong.
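The sector arithmetic can be checked with a rough sketch like the one below. It rests on several assumptions that are mine and not from your output: the partitions sit back to back in the listed order, the first one starts at 1 MiB, the sizes are binary units, and the rounded sizes from the partition listing are exact. The function name `locate` is hypothetical.

```python
SECTOR_SIZE = 512
MIB = 1024 ** 2
GIB = 1024 ** 3

# (name, size in bytes) in on-disk order, using the rounded sizes from the question.
PARTITIONS = [
    ("sda1", 2 * MIB),
    ("sda2", 128 * MIB),
    ("sda3", 1536 * MIB),  # 1.5 GB
    ("sda4", 8 * GIB),
    ("sda5", 8 * GIB),
    ("sda6", 94 * GIB),
]


def locate(sector, first_offset=1 * MIB):
    """Return (partition name, fraction of the way through it) for a disk sector.

    Assumes contiguous partitions starting at first_offset; returns (None, None)
    if the sector falls outside all of them.
    """
    offset = sector * SECTOR_SIZE
    start = first_offset
    for name, size in PARTITIONS:
        if start <= offset < start + size:
            return name, (offset - start) / size
        start += size
    return None, None


name, fraction = locate(32839936)
print(name, round(fraction, 2))  # sda5 0.75
```

Under those assumptions the failing sector lands inside sda5, roughly three quarters of the way through it. The exact fraction depends on the real partition boundaries, which a tool that prints start and end sectors would show.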

You write (my emphasis):

Suspecting a HDD crash, I took it out and used in another PC as an external USB HDD drive and I was able to mount & see all partitions and files within. So I assume Disc is OK.

I would say that your assumption is unfounded. The disk is obviously having some problem; with any luck, it'll be easy to fix.

The first thing I would do in your situation is to refresh my backup of everything that is on that disk. Make sure that you do not overwrite or delete anything from your most recent backup, as there is certainly a possibility that you will need it. Perhaps the best option is to make a fresh backup onto a new (or at least not previously used for your own backups) drive of everything that you are able to access. Expect some I/O errors on the source while making that copy.
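To illustrate what copying in the presence of I/O errors means in practice, here is a minimal sketch of a block-by-block copy that zero-fills any block it cannot read and keeps going, instead of aborting at the first error. The function names `read_fully` and `rescue_copy` are hypothetical, and this is only a sketch of the idea; a dedicated rescue tool such as GNU ddrescue handles retries and logging far more carefully.

```python
def read_fully(src, offset, length):
    """Read up to length bytes at offset, looping over short reads."""
    src.seek(offset)
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = src.read(remaining)
        if not chunk:
            break  # end of the source
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def rescue_copy(src, dst, total_size, block_size=64 * 1024):
    """Copy total_size bytes from src to dst, zero-filling unreadable blocks.

    Returns a list of (offset, length) pairs for the blocks that failed,
    so you know which parts of the copy cannot be trusted.
    """
    bad_blocks = []
    offset = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        try:
            data = read_fully(src, offset, length)
        except OSError:
            # The drive reported an I/O error for this block: record it and move on.
            bad_blocks.append((offset, length))
            data = b""
        dst.seek(offset)
        dst.write(data.ljust(length, b"\0"))  # pad so offsets stay aligned
        offset += length
    return bad_blocks
```

You would call it with two binary file objects, for example the partition opened with `open(path, "rb", buffering=0)` and an image file on the new drive opened with `"wb"`, where both paths are yours to fill in. A smaller `block_size` loses less data around each bad sector at the cost of a slower copy.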

Second comes attempting recovery. With any luck, given the errors, this is a single-sector or few-sectors problem which has caused a small amount of file system corruption, in which case e2fsck should be able to repair most of the damage. Some of your files are likely gone, but with some luck, you might be able to find them in /lost+found under the file system's mount root (meaning for example /data/lost+found if you mount /dev/sda5 on /data) after having e2fsck do what it can. Otherwise, do a comparison against your most recent backup from before the problems started, and restore relevant files from the backup. (Did I mention backups are useful if bad things ever happen, as they inevitably do?)
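For the comparison against the backup, something as simple as the following sketch can tell you which files the repaired file system has lost. It only checks whether each path in the backup still exists in the recovered tree; it does not compare contents. The function name `missing_from` and the example mount points are assumptions for illustration.

```python
import os


def missing_from(recovered_root, backup_root):
    """Return relative paths of files present in the backup but absent after recovery."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(backup_root):
        rel_dir = os.path.relpath(dirpath, backup_root)
        for name in filenames:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            # lexists also counts dangling symlinks as present.
            if not os.path.lexists(os.path.join(recovered_root, rel_path)):
                missing.append(rel_path)
    return sorted(missing)


# Example with hypothetical mount points:
# for path in missing_from("/data", "/mnt/backup"):
#     print(path)
```

Anything on that list is a candidate either to look for under lost+found or to restore from the backup.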

Third comes the question of whether you can trust the drive for future use. A few bad sectors don't have to be catastrophic from the drive's point of view, but rotational drives of about 100 GB practically cannot be sourced new today in most form factors, which points to this being a relatively old drive. Personally, I'd probably just accept that the drive has outlived its useful life at this point and get a replacement, but then again I am rather paranoid when it comes to my data; your mileage may vary. You will have to weigh the cost of a replacement drive against the risk of total failure of the drive and subsequent total loss of all the data on it.
