Check quickly that we have Fedora copr-backend backup #3390

Open
praiskup opened this issue Aug 28, 2024 · 13 comments

@praiskup
Member

praiskup commented Aug 28, 2024

The backups should be on the storinator box.

@praiskup
Member Author

We need a how-to document as the output of this ticket.

@praiskup praiskup self-assigned this Sep 2, 2024
@praiskup
Member Author

# for i in $(ls -1 /var/log/cron-*.xz | tac); do xzcat $i | grep rsnapshot; done
Sep 17 21:06:26 copr-be CROND[1470129]: (copr) CMDOUT (rsnapshot encountered an error! The program was invoked with these options:)
Sep 17 21:06:26 copr-be CROND[1470129]: (copr) CMDOUT (/bin/rsnapshot -c /srv/nfs/copr-be/copr-be-copr-user/rsnapshot.conf push )
Sep 17 21:06:26 copr-be CROND[1470129]: (copr) CMDOUT (ERROR: Could not write lockfile /srv/nfs/copr-be/copr-be-copr-user/rsnapshot.pid: No space left on device)
Sep 17 21:06:27 copr-be CROND[1470129]: (copr) CMDOUT (  File "/srv/nfs/copr-be/copr-be-copr-user/rsnapshot", line 58, in <module>)
Sep 17 21:06:27 copr-be CROND[1470129]: (copr) CMDOUT (  File "/srv/nfs/copr-be/copr-be-copr-user/rsnapshot", line 51, in _main)
Sep 17 21:06:27 copr-be CROND[1470129]: (copr) CMDOUT (  File "/srv/nfs/copr-be/copr-be-copr-user/rsnapshot", line 42, in rotate)
Sep 17 21:06:27 copr-be CROND[1470129]: (copr) CMDOUT (subprocess.CalledProcessError: Command '['/bin/rsnapshot', '-c', '/srv/nfs/copr-be/copr-be-copr-user/rsnapshot.conf', 'push']' returned non-zero exit status 1.)
Sep 17 21:06:28 copr-be CROND[1470129]: (copr) CMDEND (ionice --class=idle /usr/local/bin/rsnapshot_copr_backend >/dev/null)
Sep 14 01:01:02 copr-be CROND[1470229]: (copr) CMD (ionice --class=idle /usr/local/bin/rsnapshot_copr_backend >/dev/null)
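
To cut the cron output above down to the actual failure, a small filter can help. This is only a sketch: `rsnapshot_errors` is a hypothetical helper name, and it assumes the failures keep showing up as `ERROR:` lines that also mention rsnapshot, as they do in the log above.

```shell
# Hypothetical helper: read cron log lines on stdin and print only the
# lines that mention rsnapshot and carry an "ERROR:" message.
rsnapshot_errors() {
    grep 'rsnapshot' | grep 'ERROR:'
}
```

Assumed usage, with a glob instead of parsing `ls` output: `for f in /var/log/cron-*.xz; do xzcat "$f" | rsnapshot_errors; done`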

@praiskup
Member Author

$ lvresize /dev/VG_nfs/copr-be -L +8TB
$ xfs_growfs /srv/nfs/copr-be/
$ df -h /srv/nfs/copr-be/
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/VG_nfs-copr--be   48T   40T  8.1T  84% /srv/nfs/copr-be
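
To notice the volume filling up before rsnapshot fails on the lockfile, the Use% column can be pulled out of `df`. A minimal sketch, assuming `df -P` style output on stdin; `fs_use_pct` is a hypothetical helper name, not something that exists on the box.

```shell
# Hypothetical helper: print the Use% value (digits only) for one mount
# point, reading `df -P` style output on stdin.
fs_use_pct() {
    awk -v mnt="$1" 'NR > 1 && $NF == mnt { sub(/%$/, "", $(NF-1)); print $(NF-1) }'
}
```

Assumed usage: `df -P /srv/nfs/copr-be/ | fs_use_pct /srv/nfs/copr-be`, then compare the number against whatever threshold we pick.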

@praiskup
Member Author

Running ionice --class=idle /usr/local/bin/rsnapshot_copr_backend manually.

@praiskup
Member Author

Still running the rsync :-( and we seem to be running out of space again:
/dev/mapper/VG_nfs-copr--be 48T 47T 1.7T 97% /srv/nfs/copr-be

@praiskup
Member Author

I would remove the old increments, but that would probably break the running rsnapshot process. I'll keep the sync going for now and wait to see whether it fails; if it does, I'll remove the old increments and then restart rsnapshot.

@praiskup
Member Author

Ok, going with /bin/rm -rf push.3 push.2 push.1 push.0 first, keeping the last .sync
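
Before running the `rm -rf`, a dry run that lists what would go is cheap. A sketch only: `list_old_increments` is a hypothetical helper, and the snapshot root passed to it is an assumption (this ticket does not spell out where the push.N directories live).

```shell
# Hypothetical dry-run helper: print the numbered increments (push.0,
# push.1, ...) under a snapshot root. .sync and anything without a purely
# numeric suffix is left out of the list.
list_old_increments() {
    for d in "$1"/push.*; do
        case "${d##*/}" in
            push.|push.*[!0-9]*) continue ;;   # not a numbered increment
        esac
        if [ -d "$d" ]; then
            printf '%s\n' "$d"
        fi
    done
}
```

The output can be reviewed by eye first and only then handed to the actual removal.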

@praiskup
Member Author

praiskup commented Oct 5, 2024

[copr@copr-be ~][PROD]$ ionice --class=idle /usr/local/bin/rsnapshot_copr_backend
Warning: Permanently added 'storinator01.rdu-cc.fedoraproject.org' (ED25519) to the list of known hosts.
building file list ... 
rsync: [sender] opendir "/var/lib/copr/public_html/archive/issues/copr-3016" failed: Permission denied (13)
Timeout, server storinator01.rdu-cc.fedoraproject.org not responding.
rsync: [sender] write error: Broken pipe (32)
rsync error: unexplained error (code 255) at io.c(848) [sender=3.3.0]

@praiskup
Member Author

praiskup commented Oct 7, 2024

 33,898,430,947   0%    1.08MB/s    8:17:17 (xfr#69052, to-chk=26461/64855270)
/var/lib/copr/public_html/temp/
/var/lib/copr/public_html/temp/issue-3067/
/var/lib/copr/public_html/usage-2019-08-04/
/var/lib/copr/public_html/usage4/
 33,898,430,947   0%    1.08MB/s    8:17:17 (xfr#69052, to-chk=0/64855270)
rsync: [receiver] stat "var/lib/copr/public_html/temp/issue-3067" (in push) failed: No such file or directory (2)
----------------------------------------------------------------------------
rsnapshot encountered an error! The program was invoked with these options:
/bin/rsnapshot -c /srv/nfs/copr-be/copr-be-copr-user/rsnapshot.conf push 
----------------------------------------------------------------------------
ERROR: Could not write lockfile /srv/nfs/copr-be/copr-be-copr-user/rsnapshot.pid: No space left on device
Traceback (most recent call last):
  File "/srv/nfs/copr-be/copr-be-copr-user/rsnapshot", line 58, in <module>
    _main()
  File "/srv/nfs/copr-be/copr-be-copr-user/rsnapshot", line 51, in _main
    rotate(database)
  File "/srv/nfs/copr-be/copr-be-copr-user/rsnapshot", line 42, in rotate
    subprocess.check_call(cmd)
  File "/usr/lib64/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/bin/rsnapshot', '-c', '/srv/nfs/copr-be/copr-be-copr-user/rsnapshot.conf', 'push']' returned non-zero exit status 1.


sent 34,368,856,397 bytes  received 217,861,583 bytes  272,708.92 bytes/sec
total size is 41,526,683,163,969  speedup is 1,200.65
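
For reading the summary above: rsync's "speedup" is the total size divided by the bytes actually sent plus received, so a high value means most of the 41.5 TB was already on the other side. A sketch that reproduces the number; `rsync_speedup` is a hypothetical helper name.

```shell
# Hypothetical helper: recompute rsync's "speedup" figure.
# Arguments: total size, bytes sent, bytes received (commas allowed).
rsync_speedup() {
    total=$(printf '%s' "$1" | tr -d ',')
    sent=$(printf '%s' "$2" | tr -d ',')
    received=$(printf '%s' "$3" | tr -d ',')
    awk -v t="$total" -v s="$sent" -v r="$received" \
        'BEGIN { printf "%.2f\n", t / (s + r) }'
}

rsync_speedup 41,526,683,163,969 34,368,856,397 217,861,583   # prints 1200.65
```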

@praiskup
Member Author

praiskup commented Oct 7, 2024

Starting with: /dev/mapper/VG_nfs-copr--be 48T 345G 48T 1% /srv/nfs/copr-be

@praiskup
Member Author

praiskup commented Oct 9, 2024

Hmmm

Timeout, server storinator01.rdu-cc.fedoraproject.org not responding.
rsync: [sender] write error: Broken pipe (32)
rsync error: unexplained error (code 255) at io.c(848) [sender=3.3.0]

real    1038m41.824s
user    67m3.939s
sys     85m51.315s

Even though storinator's sshd is up and running:

● sshd.service - OpenSSH server daemon
     Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; preset: enabled)
     Active: active (running) since Sat 2024-10-05 22:14:15 UTC; 3 days ago

@praiskup
Member Author

13,669,445,228,841  66%   31.94MB/s   58:19:50
Timeout, server storinator01.rdu-cc.fedoraproject.org not responding.

rsync: [sender] write error: Broken pipe (32)
rsync error: unexplained error (code 255) at io.c(848) [sender=3.3.0]

real    11032m7.826s
user    821m15.285s
sys     946m50.620s
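
The `real` values from `time` are easier to compare in hours. A small conversion sketch; `real_to_hours` is a hypothetical helper name and it assumes the `NNNmSS.SSSs` format shown above.

```shell
# Hypothetical helper: convert a `time` value such as 11032m7.826s to hours.
real_to_hours() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.1f\n", ($1 * 60 + $2) / 3600 }'
}

real_to_hours 1038m41.824s    # prints 17.3
real_to_hours 11032m7.826s    # prints 183.9
```

So the first attempt died after about 17 hours and this one after about 184 hours, a bit over 7.5 days.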

@praiskup
Member Author

# Tolerate up to 5 h without a reply from the server (20 s x 900 probes = 18000 s)
ServerAliveInterval 20
ServerAliveCountMax 900
ConnectTimeout 120

Before that, I tried 20 / 5 / 60 for the same three options.
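
For the record, the arithmetic behind these values: ssh gives up after roughly ServerAliveInterval x ServerAliveCountMax seconds without a reply. A tiny sketch; `keepalive_window` is a hypothetical helper name.

```shell
# Hypothetical helper: approximate number of seconds ssh tolerates an
# unresponsive server, given ServerAliveInterval and ServerAliveCountMax.
keepalive_window() {
    echo $(( $1 * $2 ))
}

keepalive_window 20 900   # prints 18000 (5 hours)
keepalive_window 20 5     # prints 100
```

With the earlier 20 / 5 pair, a stall of less than two minutes on the storinator side was enough to drop the connection.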

Labels
None yet
Projects
Status: In Progress