Add post about my Mastodon DB hickup
This commit is contained in:
parent
7d13a05f32
commit
4429b381d0
169
content/blog/mastodon-database-hickup.rst
Normal file
169
content/blog/mastodon-database-hickup.rst
Normal file
@ -0,0 +1,169 @@
|
||||
Mastodon database hickup
|
||||
########################
|
||||
|
||||
:date: 2020-04-23T11:17Z
|
||||
:category: blog
|
||||
:tags: postgresql,mastodon
|
||||
:url: blog/mastodon-database-hickup/
|
||||
:save_as: blog/mastodon/database-hickup/index.html
|
||||
:status: published
|
||||
:author: Gergely Polonkai
|
||||
|
||||
|
||||
After being online for almost a year, my `Mastodon instance <https://social.polonkai.eu>`_ is back
|
||||
up. However, it wasn’t as straightforward as i wanted it to be.
|
||||
|
||||
About a year ago my Mastodon server went down. I messed up my docker images big time and the
|
||||
then-new version 2.8.4 didn’t want to start up. I left it there, and as i had a pretty rough year
|
||||
(in the good sense) i didn’t have time to bring it up again.
|
||||
|
||||
One of the constant problems i had with this instance is the size of the attachment directory. It
|
||||
easily grew to an enormous size in no time, so i had to constantly clean things up. Plus, it took
|
||||
away the precious space away from my other projects. So my first move was moving all these files
|
||||
to a DigitalOcean Space; with the ``awscli`` Python package it was easy as pie. If you want to do
|
||||
it, too, i followed `this article
|
||||
<https://angristan.xyz/2018/05/moving-mastodon-media-files-to-wasabi-object-storage/>`_.
|
||||
|
||||
Next came the database.
|
||||
|
||||
I tried to load the database dump, created using ``pg_dump -f mastodon.sql mastodon``, with
|
||||
``psql -f mastodon.sql``. Everything went fine, all the records created, except this error
|
||||
message:
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
psql:mastodon-dump.sql:1691661: ERROR: could not create unique index "index_tags_on_name"
|
||||
DETAIL: Key (name)=(opensource) is duplicated.
|
||||
|
||||
First i tried to ignore it, but running `rails db:migrate` failed when it wanted to manipulate
|
||||
this index.
|
||||
|
||||
So i issued this query:
|
||||
|
||||
.. code-block:: sql
|
||||
|
||||
SELECT COUNT(*) AS count, name FROM tags GROUP BY name HAVING COUNT(*) > 1;
|
||||
|
||||
And the result was this:
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
count | name
|
||||
2 | opensource
|
||||
2 | xmpp
|
||||
2 | fdroid
|
||||
2 | socialmedia
|
||||
2 | blogging
|
||||
2 | c64
|
||||
2 | blog
|
||||
|
||||
Then collected the IDs of these duplicate tag records using this query:
|
||||
|
||||
.. code-block:: sql
|
||||
|
||||
SELECT id, name, created_at
|
||||
FROM tags
|
||||
WHERE name in ('opensource',
|
||||
'xmpp',
|
||||
'fdroid',
|
||||
'socialmedia',
|
||||
'blogging',
|
||||
'c64',
|
||||
'blog')
|
||||
ORDER BY name, created_at;
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
id | name | created_at
|
||||
-------+-------------+------------
|
||||
159 | blog | 2018-02-24 19:56:01.710702
|
||||
15941 | blog | 2019-04-03 22:05:56.438488
|
||||
158 | blogging | 2018-02-24 19:56:01.721354
|
||||
16006 | blogging | 2019-04-09 12:13:20.852976
|
||||
5441 | c64 | 2018-07-31 00:50:56.172468
|
||||
16036 | c64 | 2019-04-14 19:56:40.692197
|
||||
924 | fdroid | 2018-04-14 19:39:21.261817
|
||||
15947 | fdroid | 2019-04-04 19:10:50.190317
|
||||
237 | opensource | 2018-03-05 21:50:52.609723
|
||||
15929 | opensource | 2019-04-03 11:49:21.772961
|
||||
251 | socialmedia | 2018-03-06 23:02:33.573775
|
||||
16034 | socialmedia | 2019-04-14 19:07:37.081635
|
||||
519 | xmpp | 2018-03-28 16:02:05.023784
|
||||
15988 | xmpp | 2019-04-07 08:06:59.429965
|
||||
|
||||
I wanted to see the ``created_at`` field just to know when things got bad. As it turns out, it
|
||||
was around April 2019, so it might have happened with the 2.8.0 upgrade. Was it me, or some
|
||||
problem with the migrations remains a mistery.
|
||||
|
||||
I also wanted to know about all the tables referencing tags:
|
||||
|
||||
.. code-block:: sql
|
||||
|
||||
SELECT
|
||||
tc.table_name, kcu.column_name
|
||||
FROM
|
||||
information_schema.table_constraints AS tc
|
||||
JOIN information_schema.key_column_usage AS kcu
|
||||
ON tc.constraint_name = kcu.constraint_name
|
||||
JOIN information_schema.constraint_column_usage AS ccu
|
||||
ON ccu.constraint_name = tc.constraint_name
|
||||
WHERE constraint_type = 'FOREIGN KEY' AND ccu.table_name = 'tags';
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
table_name | column_name
|
||||
-------------------+-------------
|
||||
statuses_tags | tag_id
|
||||
account_tag_stats | tag_id
|
||||
|
||||
And last, but not least, i had to massage the dump a bit. After creating a backup, i opened up
|
||||
the file in ViM and looked for the place where it was loading the data into the ``tags`` table.
|
||||
It looked like this:
|
||||
|
||||
.. code-block:: sql
|
||||
|
||||
COPY public.tags (name, created_at, updated_at, id) FROM stdin;
|
||||
blogging 2018-02-24 19:56:01.710702 2018-02-24 19:56:01.710702 158
|
||||
blog 2018-02-24 19:56:01.721354 2018-02-24 19:56:01.721354 159
|
||||
opensource 2018-03-05 21:50:52.609723 2018-03-05 21:50:52.609723 237
|
||||
socialmedia 2018-03-06 23:02:33.573775 2018-03-06 23:02:33.573775 251
|
||||
xmpp 2018-03-28 16:02:05.023784 2018-03-28 16:02:05.023784 519
|
||||
fdroid 2018-04-14 19:39:21.261817 2018-04-14 19:39:21.261817 924
|
||||
c64 2018-07-31 00:50:56.172468 2018-07-31 00:50:56.172468 5441
|
||||
opensource 2019-04-03 11:49:21.772961 2019-04-03 11:49:21.772961 15929
|
||||
blog 2019-04-03 22:05:56.438488 2019-04-03 22:05:56.438488 15941
|
||||
fdroid 2019-04-04 19:10:50.190317 2019-04-04 19:10:50.190317 15947
|
||||
xmpp 2019-04-07 08:06:59.429965 2019-04-07 08:06:59.429965 15988
|
||||
blogging 2019-04-09 12:13:20.852976 2019-04-09 12:13:20.852976 16006
|
||||
socialmedia 2019-04-14 19:07:37.081635 2019-04-14 19:07:37.081635 16034
|
||||
c64 2019-04-14 19:56:40.692197 2019-04-14 19:56:40.692197 16036
|
||||
|
||||
For every tag, i removed the line with the latest date, and noted their IDs, then went on to the
|
||||
``statuses_tags`` table, which looked like this (i don’t put all rows here as it would take a lot
|
||||
of space):
|
||||
|
||||
.. code-block:: sql
|
||||
|
||||
COPY public.statuses_tags (status_id, tag_id) FROM stdin;
|
||||
101862091116861098 15929
|
||||
|
||||
Here, i had to look for the IDs noted before in the last column (ie. at the end of a line), and
|
||||
for every line i change the newer ID to the old ones (15929 became 237, and so on).
|
||||
|
||||
Finally, the ``account_tag_stats`` table:
|
||||
|
||||
.. code-block:: sql
|
||||
|
||||
COPY public.account_tag_stats (id, tag_id, accounts_count, hidden, created_at, updated_at) FROM stdin;
|
||||
1 15929 0 f 2019-04-03 11:49:21.810564 2019-04-03 11:49:21.810564
|
||||
|
||||
Here, the ID i was looking for is in the second column, so i used the magical ViM regex
|
||||
``^[0-9]\+\t\(15929\|15941\)`` (but with all the other IDs in the parentheses, too), and again
|
||||
changed the new IDs to the old ones.
|
||||
|
||||
I still can’t believe i didn’t mess up the database too much, but it loaded just fine, and the
|
||||
``db:migrate`` command executet just fine.
|
||||
|
||||
I also had some problems with so nginx reverse proxying, but it all turned out well, and my
|
||||
instance is back online after a year. Come follow me at `@gergely@polonkai.eu
|
||||
<https://social.polonkai.eu/@gergely>`_!
|
Loading…
x
Reference in New Issue
Block a user