1

I have a failing hard drive. After first trying Testdisk, which was running very slowly, I am now using ddrescue to copy the disk. The original disk is a 2TB drive that had 4 NTFS partitions (which I mistakenly formatted as ext4 from a live Ubuntu system).

I ran ddrescue from the live Ubuntu session with: sudo ddrescue -d -f -r3 /dev/sdb /dev/sdc sdc.log

I used a new 4TB hard drive as the target, and now I want to make a second copy. For the second copy I have another new 4TB drive, which is different from the first copy drive. I read that the target drive must be the same size or bigger. My intermediate drive is bigger than the failing drive (2TB). What if my second target drive is few MBs smaller than the intermediate drive? Will ddrescue fail to write the data on the second drive? Or will it stop writing after the original 2TB data ends on the first copy drive?

What shall I do in either case?

Thanks in advance.

UPDATE 21Feb2023: Below is the log file text (I copied the sdc.log file to a USB drive and opened it separately):

# Mapfile. Created by GNU ddrescue version 1.26
# Command line: ddrescue -d -f -r3 /dev/sdb /dev/sdc sdc.log
# Start time:   2023-02-18 05:03:10
# Current time: 2023-02-21 10:47:14
# Copying non-tried blocks... Pass 1 (forwards)
# current_pos  current_status  current_pass
0x610FA80000     ?               1
#      pos        size  status
0x00000000  0x47AC630000  +
0x47AC630000  0x00010000  *
0x47AC640000  0x01320000  ?
0x47AD960000  0x479140000  +
0x4C26AA0000  0x00010000  *
0x4C26AB0000  0x01320000  ?
0x4C27DD0000  0x2FA860000  +
0x4F22630000  0x00010000  *
0x4F22640000  0x01320000  ?
0x4F23960000  0x1F7230000  +
0x511AB90000  0x00010000  *
0x511ABA0000  0x01320000  ?
0x511BEC0000  0x32780000  +
0x514E640000  0x00010000  *
0x514E650000  0x01320000  ?
0x514F970000  0xC5D00000  +
0x5215670000  0x00010000  *
----------------------------------------------------

Here is the display in the terminal: [screenshot]

Secondly, here is the report from smartmontools (run with "sudo smartctl -a /dev/sdb >myreport"):

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.19.0-21-generic] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1ER164
Serial Number:    W4Z3P9PN
LU WWN Device Id: 5 000c50 09b8366dd
Firmware Version: CC26
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Feb 21 11:10:57 2023 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 80) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 1) minutes.
Extended self-test routine recommended polling time: ( 209) minutes.
Conveyance self-test routine recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   110   080   006    Pre-fail  Always   -           215372854
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   096   096   020    Old_age   Always   -           4686
  5 Reallocated_Sector_Ct   0x0033   079   079   010    Pre-fail  Always   -           26840
  7 Seek_Error_Rate         0x000f   082   060   030    Pre-fail  Always   -           168010836
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always   -           32234
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   096   096   020    Old_age   Always   -           4690
183 Runtime_Bad_Block       0x0032   091   091   000    Old_age   Always   -           9
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always   -           0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always   -           3934
188 Command_Timeout         0x0032   100   052   000    Old_age   Always   -           112 301 483
189 High_Fly_Writes         0x003a   098   098   000    Old_age   Always   -           2
190 Airflow_Temperature_Cel 0x0022   062   034   045    Old_age   Always   In_the_past 38 (Min/Max 26/41 #1798)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always   -           0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always   -           191
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always   -           224369
194 Temperature_Celsius     0x0022   038   066   000    Old_age   Always   -           38 (0 8 0 0 0)
197 Current_Pending_Sector  0x0012   001   001   000    Old_age   Always   -           37904
198 Offline_Uncorrectable   0x0010   001   001   000    Old_age   Offline  -           37904
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always   -           68
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline  -           19597h+41m+10.668s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline  -           57999941039
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline  -           1327164476473

SMART Error Log Version: 1
ATA Error Count: 3937 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 3937 occurred at disk power-on lifetime: 32223 hours (1342 days + 15 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
60 00 80 ff ff ff 4f 00  2d+19:53:57.102  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+19:53:57.101  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+19:53:57.101  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+19:53:57.101  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+19:53:57.100  READ FPDMA QUEUED

Error 3936 occurred at disk power-on lifetime: 32222 hours (1342 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
60 00 80 ff ff ff 4f 00  2d+18:50:34.589  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:50:34.588  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:50:34.588  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:50:34.587  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:50:34.587  READ FPDMA QUEUED

Error 3935 occurred at disk power-on lifetime: 32221 hours (1342 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
60 00 80 ff ff ff 4f 00  2d+18:04:18.861  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:04:18.861  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:04:18.861  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:04:18.860  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+18:04:18.860  READ FPDMA QUEUED

Error 3934 occurred at disk power-on lifetime: 32221 hours (1342 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
60 00 80 ff ff ff 4f 00  2d+17:50:52.906  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+17:50:52.906  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+17:50:52.905  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+17:50:52.905  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+17:50:52.905  READ FPDMA QUEUED

Error 3933 occurred at disk power-on lifetime: 32219 hours (1342 days + 11 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 53 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
60 00 80 ff ff ff 4f 00  2d+15:56:40.578  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+15:56:40.016  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+15:56:39.728  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+15:56:36.988  READ FPDMA QUEUED
60 00 80 ff ff ff 4f 00  2d+15:56:36.988  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Irfan
  • 11
  • If you already made a copy, why not just... copy the copy? – Journeyman Geek Feb 20 '23 at 13:29
  • Yes, that's what I mean to do. Copy the copy. The first copy attempt with ddrescue is still running, it has been 2d 8 h and it has done 300 gb (15.53%) so far. When it finishes, I hope to make a second copy from the 1st copy. – Irfan Feb 20 '23 at 13:36
  • Ah! I assumed you'd set up ddrescue to do an image not a disk to disk copy. Would a file level copy suffice? – Journeyman Geek Feb 20 '23 at 13:47
  • ddrescue to an image file is much more useful. – harrymc Feb 20 '23 at 13:55
  • I have no idea at the moment, and doing this for the first time. My files (in original NTFS partition) are buried hidden under the ext4 partition (no data was written over, just partitioned and formatted once). One of the previous NTFS partitions (probably second partition) carries the photos which we want to recover. – Irfan Feb 20 '23 at 13:56
  • @Harrymc, disk manufacturers like your idea because it requires you to buy a disk that is bigger than the defective source. – r2d3 Feb 20 '23 at 15:13
  • "it has been 2d 8 h and it has done 300 gb" ... that averages about 1.5 MB/sec, which is incredibly low. Please make sure you're keeping an eye on dmesg and the disk's SMART data. It might also be worth looking into --skip-size and --reverse to see if skipping or approaching the problem area from the other side helps. – Attie Feb 20 '23 at 17:26
  • @Attie Thanks, and yes, the rate goes up and down but on average it is still slow. The output on screen in terminal does not indicate any bad areas (the number is still zero) but as I understand ddrescue skips bad sectors, so maybe that is why. Earlier, I have used TestDisk on my first try, and the process ran for over a week and covered only 30% so I stopped it. Compared to that, ddrescue looks faster to me. I will post the SMART data etc when I get home. – Irfan Feb 21 '23 at 06:49
  • @r2d3 well yes and no. You can still use the 'rest' of the disk with an image, and once recovery is done compress it to save space for copies, – Journeyman Geek Feb 21 '23 at 11:32
  • @JourneymanGeek Please tell me how can I rearrange all that text, just as you did. – Irfan Feb 21 '23 at 12:41
  • There's a few ways to - but I used 'code fences' - basically ``` before and after each code block. – Journeyman Geek Feb 21 '23 at 12:54
  • Please have a look at the logfile update. I am anxious if it is going ok. – Irfan Feb 23 '23 at 11:36
  • Please do not change the scope of the question after you got answers. Your original questions have been answered. "How is it going so far?" is a new distinct question. The site is not an interactive support service where threads may evolve. – Kamil Maciorowski Feb 23 '23 at 11:39
  • ok, Sorry, I am new to these forums. – Irfan Feb 23 '23 at 11:44
  • No harm done. If you need help beyond the original scope, you can ask a new question; but "how is it going so far?" is not a good question. In general try to make your questions useful for future users with similar problems. The question above is fine in this matter; "how is it going?" is not. A specific concern like "is it normal that ddrescue takes so much time?" may be, but note that similar questions already exist, check them first. – Kamil Maciorowski Feb 23 '23 at 12:07

2 Answers

2

What if my second target drive is few MBs smaller than the intermediate drive? Will ddrescue fail to write the data on the second drive?

It will, but only when it tries to write beyond the size of the second drive. Normally the first pass is done in the forward direction, so if there are no read errors then at the moment of the write error the whole second drive will have been rewritten.

Or will it stop writing after the original 2TB data ends on the first copy drive?

It will not. As far as ddrescue is concerned, on the first copy drive there will be no indication where the copy ends. You can treat each drive as a linear sequence of bytes; each sequence has its own length, which is the size of the respective device. Copying (a part of) one sequence to (a part of) another sequence does not change the length of the latter. Drives are not like regular files in this respect: you cannot truncate them easily.

What shall I do in either case?

Something else in the first place. Copy only the part you need. E.g. you can use -s when copying with ddrescue from the intermediate drive to the second drive:

-s bytes
--size=bytes
Maximum size of the rescue domain in bytes. It limits the amount of input data to be copied. […] If ddrescue can't determine the size of the input file, you may need to specify it with this option. […]

(source)

You should specify the exact size of the original drive or a larger number. If there is any doubt, use a larger number. If you use a larger number then ddrescue will try to copy some garbage beyond the 2TB data. The point is you want this garbage to be reasonably small. Copying without -s will get almost all the garbage from the intermediate drive, totally in vain.

Even if you provide the exact number, after ddrescue finishes the second drive will still contain its own garbage beyond the 2TB data, because on that drive, too, there will be no indication where the copy ends.
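A rough sketch of the -s approach follows. It assumes the size of the original drive is the 2,000,398,934,016 bytes shown as User Capacity in the smartctl report in the question (on a live system, blockdev --getsize64 /dev/sdb should give the same kind of figure). /dev/sdc is the intermediate drive from the question, while /dev/sdX and second.log are placeholder names for the second drive and a fresh mapfile. The script only prints the command, so the device names can be checked before anything is written.

```shell
#!/bin/sh
# Size of the original 2TB drive in bytes, taken from the "User Capacity"
# line of the smartctl report in the question.
src_bytes=2000398934016

# /dev/sdc is the intermediate copy from the question. The other two names
# are placeholders (assumptions), not values from the question.
intermediate=/dev/sdc
second=/dev/sdX
mapfile=second.log

# Sanity check: the size should be a whole number of 512-byte logical sectors.
sectors=$((src_bytes / 512))
remainder=$((src_bytes % 512))

# -f is needed because the output is a block device; -s limits the rescue
# domain to the size of the original drive.
cmd="ddrescue -f -s $src_bytes $intermediate $second $mapfile"

echo "source size: $src_bytes bytes = $sectors sectors (remainder $remainder)"
echo "sudo $cmd"
```

Printing instead of executing is a deliberate choice here: swapping the input and output device names would overwrite the only good copy.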

Hopefully there will be no read errors at this stage. If so then you should treat the mapfile from the first stage (i.e. your sdc.log) as relevant for both copies.
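To judge how much of the first copy a mapfile actually covers, the block sizes can be totalled per status. The following is only a sketch: it embeds the first three data blocks of the mapfile posted in the question as sample input, and it relies on the documented mapfile status characters (+ finished, * non-trimmed, / non-scraped, - bad sector, ? non-tried). For a real run, the here-document would be replaced by a redirect from sdc.log.

```shell
#!/bin/sh
# Totals, in bytes, per mapfile status.
rescued=0; non_trimmed=0; non_scraped=0; bad=0; non_tried=0
first=1

while read -r pos size status; do
    case "$pos" in
        '#'*|'') continue ;;        # skip comment lines and blank lines
    esac
    if [ "$first" -eq 1 ]; then     # first non-comment line is the status line
        first=0
        continue
    fi
    # Sizes are hexadecimal (0x...), which shell arithmetic accepts directly.
    case "$status" in
        '+') rescued=$((rescued + $size)) ;;
        '*') non_trimmed=$((non_trimmed + $size)) ;;
        '/') non_scraped=$((non_scraped + $size)) ;;
        '-') bad=$((bad + $size)) ;;
        '?') non_tried=$((non_tried + $size)) ;;
    esac
done <<'EOF'
# current_pos  current_status  current_pass
0x610FA80000     ?               1
#      pos        size  status
0x00000000  0x47AC630000  +
0x47AC630000  0x00010000  *
0x47AC640000  0x01320000  ?
EOF

echo "rescued:     $rescued bytes"
echo "non-trimmed: $non_trimmed bytes"
echo "non-scraped: $non_scraped bytes"
echo "bad-sector:  $bad bytes"
echo "non-tried:   $non_tried bytes"
```

Anything not counted as rescued is an area where both copies hold no valid data from the original drive.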

If the original drive uses GPT then please see this answer to learn what to do to fix GPT on a copy, in case you choose to do this.


For future reference: in similar cases consider creating a filesystem and using ddrescue to write to a regular file inside the filesystem. If you did this for the intermediate drive and for the second drive then copying the copy would mean copying the regular file from one filesystem to the other; you could do this with cp, without worrying about sizes at all.

It's not too late to create a filesystem on the second drive and to copy from the intermediate drive to a regular file there. You will still need to use -s though, because on the intermediate drive there will be no indication where the copy ends.

Personally I prefer ddrescue-ing to a regular file in a filesystem that supports CoW (e.g. Btrfs). Then I can make a non-CoW copy (cp --reflink=never) for redundancy (to another filesystem if I want or need) and any number of CoW copies within the filesystem (cp --reflink=always). Among these CoW copies I treat one as immutable (chattr +i) and work with others, possibly with tools that modify data. This way, if anything goes wrong with modifications, I can always create a new CoW copy of the immutable one without straining the disk(s) and virtually immediately.

(Side note: Btrfs supports compression, and a few times I have managed to store an image of a disk as a regular file on a smaller disk and to work with it.)

The downside of regular files is you need some knowledge and tools to get to partitions stored within. I mean if your copy is e.g. on /dev/sdz then the OS will create sdz1, sdz2 etc. automatically when the disk is connected or upon partprobe (unless logical sector sizes don't match between the original and the copy, see this question and my answer there to see what the problem is); but if your copy is in a regular_file then you won't get regular_file1, regular_file2 etc. as partitions. Useful tools:

  • mount -o offset=… …
  • losetup -o … --sizelimit … …
  • kpartx …
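The offset and size limit these tools expect are in bytes, while partition tables are usually listed in sectors, so a small conversion is needed. The sketch below assumes 512-byte logical sectors (as reported for the original drive) and uses made-up values for the start sector and partition length; the real numbers would come from the partition table inside the image, for example from fdisk -l regular_file. The mount point /mnt is also just a placeholder. The commands are only printed, not executed.

```shell
#!/bin/sh
# Assumed values for illustration only; read the real ones from the
# partition table stored in the image.
sector_size=512        # logical sector size of the original drive
start_sector=2048      # hypothetical start of the partition, in sectors
size_sectors=204800    # hypothetical length of the partition, in sectors

# Convert sectors to the byte values that losetup and mount expect.
offset=$((start_sector * sector_size))
sizelimit=$((size_sectors * sector_size))

# Read-only variants, so the image is not modified by accident.
losetup_cmd="losetup --find --show --read-only -o $offset --sizelimit $sizelimit regular_file"
mount_cmd="mount -o ro,loop,offset=$offset regular_file /mnt"

echo "sudo $losetup_cmd"
echo "sudo $mount_cmd"
```

kpartx needs no such arithmetic, since it reads the partition table of the image and creates the mappings itself.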

Another downside is you cannot boot from a regular file. If you plan to try to boot from a copy then copying directly to a block device is way more reasonable.

  • Thanks a lot for elaborating. I understood the first part to some extent, but the second part just went above my head totally. I do need to learn a lot, but it is fun doing it myself. Apparently, I would have another 10 to 12 days (while it is still busy with making the first copy) to search through and use proper syntax while making the second drive from the first copy. – Irfan Feb 21 '23 at 06:46
1

What if my second target drive is few MBs smaller than the intermediate drive?

It doesn't matter in your case as the defective source is 2TB and your sector by sector copy fits completely on your first 4 TB disk. If the second 4 TB disk is a little bit (a few MBs) smaller than your first one, your defective source will nevertheless be completely contained on your second duplicate.

By the way, the software is called Testdisk, not "test disk". When writing important keywords incorrectly you diminish the search abilities in this forum. I have corrected your posting.

Or will it stop writing after the original 2TB data ends on the first copy drive?

No. ddrescue does not know how you created your first copy. It does not try to interpret the data on the first copy. ddrescue will copy all 4 TB.

r2d3
  • 3,554