This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Database patch

Tags: None
(comma "," separated)
pégésql
Karma
0

Database patch

Thu Mar 30, 2006 10:09 pm
Dear all,

Following my last post, please find enclosed a first patch. This patch is for discussion, not applying right now, as it needs to be discussed. This is only my first day of programming. I need to learn more about Amarok internals.

Here is that patch:

[code:1]Index: collectiondb.cpp
===================================================================
--- collectiondb.cpp (révision 524611)
+++ collectiondb.cpp (copie de travail)
@@ -300,7 +300,7 @@

//create tag table
query( QString( \"CREATE %1 TABLE tags%2 (\"
- \"url \" + textColumnType() + \",\"
+ \"url \" + textColumnType() + \" PRIMARY KEY,\"
\"dir \" + textColumnType() + \",\"
\"createdate INTEGER,\"
\"modifydate INTEGER,\"
@@ -398,7 +398,7 @@

// create directory statistics table
query( QString( \"CREATE %1 TABLE directories%2 (\"
- \"dir \" + textColumnType() + \" UNIQUE,\"
+ \"dir \" + textColumnType() + \" PRIMARY KEY,\"
\"changedate INTEGER );\" )
.arg( temporary ? \"TEMPORARY\" : \"\" )
.arg( temporary ? \"_temp\" : \"\" ) );
@@ -406,11 +406,11 @@
//create indexes
query( QString( \"CREATE INDEX album_idx%1 ON album%2( name );\" )
.arg( temporary ? \"_temp\" : \"\" ).arg( temporary ? \"_temp\" : \"\" ) );
- query( QString( \"CREATE INDEX artist_idx%1 ON artist%2( name );\" )
+ query( QString( \"CREATE UNIQUE INDEX artist_idx%1 ON artist%2( name );\" )
.arg( temporary ? \"_temp\" : \"\" ).arg( temporary ? \"_temp\" : \"\" ) );
- query( QString( \"CREATE INDEX genre_idx%1 ON genre%2( name );\" )
+ query( QString( \"CREATE UNIQUE INDEX genre_idx%1 ON genre%2( name );\" )
.arg( temporary ? \"_temp\" : \"\" ).arg( temporary ? \"_temp\" : \"\" ) );
- query( QString( \"CREATE INDEX year_idx%1 ON year%2( name );\" )
+ query( QString( \"CREATE UNIQUE INDEX year_idx%1 ON year%2( name );\" )
.arg( temporary ? \"_temp\" : \"\" ).arg( temporary ? \"_temp\" : \"\" ) );

if ( !temporary )
@@ -434,7 +434,11 @@
void
CollectionDB::createIndices()
{
- query( \"CREATE INDEX url_tag ON tags( url );\" );
+ // <-- Remove
+ // Not needed, since the url is a primary key
+ // This works for PostgreSQL. Please check MySQL and SQLite
+ // query( \"CREATE INDEX url_tag ON tags( url );\" );
+ // Remove -->
query( \"CREATE INDEX album_tag ON tags( album );\" );
query( \"CREATE INDEX artist_tag ON tags( artist );\" );
query( \"CREATE INDEX genre_tag ON tags( genre );\" );
@@ -446,8 +450,12 @@

query( \"CREATE INDEX embed_url ON embed( url );\" );
query( \"CREATE INDEX embed_hash ON embed( hash );\" );
-
- query( \"CREATE INDEX directories_dir ON directories( dir );\" );
+
+ // <-- Remove
+ // Not needed, since the directory is a primary key
+ // This works for PostgreSQL. Please check MySQL and SQLite
+ // query( \"CREATE INDEX directories_dir ON directories( dir );\" );
+ // Remove -->
}


@@ -525,14 +533,17 @@
{
// create music statistics database
query( QString( \"CREATE TABLE statistics (\"
- \"url \" + textColumnType() + \" UNIQUE,\"
+ \"url \" + textColumnType() + \" PRIMARY KEY,\"
\"createdate INTEGER,\"
\"accessdate INTEGER,\"
\"percentage FLOAT,\"
\"rating INTEGER DEFAULT 0,\"
\"playcounter INTEGER );\" ) );

- query( \"CREATE INDEX url_stats ON statistics( url );\" );
+ // <- Remove
+ // Index not needed since a primary key uses a default index
+ // query( \"CREATE INDEX url_stats ON statistics( url );\" );
+ // --> Remove
query( \"CREATE INDEX percentage_stats ON statistics( percentage );\" );
query( \"CREATE INDEX rating_stats ON statistics( rating );\" );
query( \"CREATE INDEX playcounter_stats ON statistics( playcounter );\" );
@@ -551,22 +562,25 @@
{
// create amazon table
query( \"CREATE TABLE amazon ( \"
- \"asin \" + textColumnType(20) + \", \"
+ \"asin \" + textColumnType(20) + \" PRIMARY KEY, \"
\"locale \" + textColumnType(2) + \", \"
\"filename \" + textColumnType(33) + \", \"
\"refetchdate INTEGER );\" );

// create lyrics table
query( QString( \"CREATE TABLE lyrics (\"
- \"url \" + textColumnType() + \",\"
+ \"url \" + textColumnType() + \" PRIMARY KEY,\"
\"lyrics \" + longTextColumnType() + \"«»);\" ) );

// create labels table
query( QString( \"CREATE TABLE label (\"
- \"url \" + textColumnType() + \",\"
+ \"url \" + textColumnType() + \" PRIMARY KEY,\"
\"label \" + textColumnType() + \"«»);\" ) );

- query( \"CREATE INDEX url_label ON label( url );\" );
+ // <- Remove
+ // Not needed since a primary key is an index
+ // query( \"CREATE INDEX url_label ON label( url );\" );
+ // --> Remove
query( \"CREATE INDEX label_label ON label( label );\" );
}

@@ -590,7 +604,7 @@

// create podcast channels table
query( QString( \"CREATE TABLE podcastchannels (\"
- \"url \" + textColumnType() + \" UNIQUE,\"
+ \"url \" + textColumnType() + \" PRIMARY KEY,\"
\"title \" + textColumnType() + \",\"
\"weblink \" + textColumnType() + \",\"
\"image \" + textColumnType() + \",\"
@@ -604,7 +618,7 @@
// create podcast episodes table
query( QString( \"CREATE TABLE podcastepisodes (\"
\"id INTEGER PRIMARY KEY %1, \"
- \"url \" + textColumnType() + \" UNIQUE,\"
+ \"url \" + textColumnType() + \" PRIMARY KEY,\"
\"localurl \" + textColumnType() + \",\"
\"parent \" + textColumnType() + \",\"
\"guid \" + textColumnType() + \",\"
@@ -626,8 +640,11 @@
\"parent INTEGER, isOpen BOOL );\" )
.arg( podcastFolderAutoInc ) );

- query( \"CREATE INDEX url_podchannel ON podcastchannels( url );\" );
- query( \"CREATE INDEX url_podepisode ON podcastepisodes( url );\" );
+ // <-Remove
+ // query( \"CREATE INDEX url_podchannel ON podcastchannels( url );\" );
+ // query( \"CREATE INDEX url_podepisode ON podcastepisodes( url );\" );
+ // --> Remove
+
query( \"CREATE INDEX localurl_podepisode ON podcastepisodes( localurl );\" );
query( \"CREATE INDEX url_podfolder ON podcastfolders( id );\" );
}
@@ -821,12 +838,21 @@
{
if ( isValue)
{
- return query( QString( \"SELECT tags.url FROM tags INNER JOIN artist ON artist.id=tags.artist INNER JOIN album ON \"
- \"album.id=tags.album WHERE (album.name = \\\"%1\\\" ) AND (artist.name = \\\"%2\\\" ) ORDER BY tags.discnumber, tags.track;\" )
+ return query( QString( \"SELECT tags.url FROM tags \"
+ \"LEFT JOIN artist ON artist.id=tags.artist \"
+ \"LEFT JOIN album ON album.id=tags.album \"
+ \"WHERE (album.name = \\\"%1\\\" ) AND (artist.name = \\\"%2\\\" ) \"
+ \"ORDER BY tags.discnumber, tags.track;\" )
.arg( album_id )
.arg( artist_id ) );
}

+ // Note by Jean Pégésql
+ // This query performs a JOIN on table year, without using the result.
+ // Here the JOIN on table year is not needed.
+ // I am not rewriting the JOIN before being sure what is this query for.
+ // It seems completely unreal. Look at the second query. The %1 is lacking.
+
if (getDbConnectionType() == DbConnection::«»postgresql) {
return query( QString( \"SELECT tags.url, tags.track AS __discard FROM tags, year WHERE tags.album = %1 AND \"
\"( tags.sampler = %2 OR tags.artist = %3 ) AND year.id = tags.year \"
@@ -848,8 +874,9 @@
QStringList
CollectionDB::artistTracks( const QString &artist_id )
{
- return query( QString( \"SELECT tags.url FROM tags, album \"
- \"WHERE tags.artist = \'%1\' AND album.id = tags.album \"
+ return query( QString( \"SELECT tags.url FROM tags \"
+ \"LEFT JOIN album ON album.id = tags.album \"
+ \"WHERE tags.artist = \'%1\' \"
\"ORDER BY album.name, tags.discnumber, tags.track;\" )
.arg( artist_id ) );
}
@@ -1340,20 +1367,23 @@
if ( artist == i18n(\"Various Artists\"«») ) {
// VAs need special handling to not match on artist name but instead check for sampler flag
values = query( QString(
- \"SELECT embed.hash, embed.url FROM tags, embed, album \"
- \"WHERE tags.url = embed.url \"
- \"AND tags.album = album.id \"
- \"AND album.name = \'%1\' \"
+ \"SELECT embed.hash, embed.url \"
+ \"FROM embed \"
+ \"LEFT JOIN tags ON tags.url = embed.url \"
+ \"LEFT JOIN album ON tags.album = album.id \"
+ \"WHERE \"
+ \"album.name = \'%1\' \"
\"AND tags.sampler = 1 \"
\"ORDER BY modifydate DESC LIMIT 1;\" )
.arg( escapeString( album ) ) );
} else {
values = query( QString(
- \"SELECT embed.hash, embed.url FROM tags, embed, artist, album \"
- \"WHERE tags.url = embed.url \"
- \"AND tags.artist = artist.id \"
- \"AND tags.album = album.id \"
- \"AND artist.name = \'%1\' \"
+ \"SELECT embed.hash, embed.url \"
+ \"FROM embed \"
+ \"LEFT JOIN tags ON tags.url = embed.url \"
+ \"LEFT JOIN artist ON tags.artist = artist.id \"
+ \"LEFT JOIN album ON tags.album = album.id \"
+ \"WHERE artist.name = \'%1\' \"
\"AND album.name = \'%2\' \"
\"ORDER BY modifydate DESC LIMIT 1;\" )
.arg( escapeString( artist ) )
@@ -1381,7 +1411,9 @@
{
QStringList values =
query( QString(
- \"SELECT embed.hash FROM tags LEFT JOIN embed ON tags.url = embed.url WHERE tags.url = \'%1\' LIMIT 1;\" )
+ \"SELECT embed.hash FROM tags \"
+ \"LEFT JOIN embed ON tags.url = embed.url \"
+ \"WHERE tags.url = \'%1\' LIMIT 1;\" )
.arg( escapeString( trackInformation.url().path() ) ) );

if ( values.empty() || !values.first().isEmpty() ) {
@@ -1545,9 +1577,12 @@
{
if (getDbConnectionType() == DbConnection::«»postgresql)
{
- return query( \"SELECT DISTINCT album.name, lower( album.name ) AS __discard FROM tags, album, artist WHERE \"
- \"tags.album = album.id AND tags.artist = artist.id \"
- \"AND artist.name = \'\" + escapeString( artist ) + \"\' \" +
+ // Note by pégésql: second query should also be migrated to LEFT JOINs
+ return query( \"SELECT DISTINCT album.name FROM tags \"
+ \"LEFT JOIN album ON tags.album = album.id \"
+ \"LEFT JOIN artist ON tags.artist = artist.id \"
+ \"WHERE \"
+ \"artist.name = \'\" + escapeString( artist ) + \"\' \" +
( withUnknown ? QString::null : \"AND album.name <> \'\' \" ) +
( withCompilations ? QString::null : \"AND tags.sampler = \" + boolF() ) +
\" ORDER BY lower( album.name );\" );
@@ -1569,8 +1604,13 @@
{
if (getDbConnectionType() == DbConnection::«»postgresql)
{
- return query( \"SELECT DISTINCT artist.name, album.name, lower( album.name ) AS __discard FROM tags, album, artist WHERE \"
- \"tags.album = album.id AND tags.artist = artist.id \" +
+ // Note by pégésql: second query should also be migrated to LEFT JOINs
+ // This query can be very intensive, since it may analyse all entries from table tags without any real CLAUSE restriction.
+ // IMHO, it should be replaced by something mode precise.
+ return query( \"SELECT DISTINCT artist.name, album.name FROM tags \"
+ \"LEFT JOIN album ON tags.album = album.id \"
+ \"LEFT JOIN artist ON tags.artist = artist.id \"
+ \"WHERE \" +
( withUnknown ? QString::null : \"AND album.name <> \'\' AND artist.name <> \'\' \" ) +
( withCompilations ? QString::null : \"AND tags.sampler = \" + boolF() ) +
\" ORDER BY lower( album.name );\" );
@@ -2098,9 +2138,12 @@
\"year.name, tags.comment, tags.discnumber, tags.composer, \"
\"tags.track, tags.bitrate, tags.length, tags.samplerate, \"
\"tags.filesize, tags.filetype \"
- \"FROM tags, album, artist, genre, year \"
- \"WHERE album.id = tags.album AND artist.id = tags.artist AND \"
- \"genre.id = tags.genre AND year.id = tags.year AND tags.url = \'%1\';\" )
+ \"FROM tags \"
+ \"LEFT JOIN album ON album.id = tags.album \"
+ \"LEFT JOIN artist ON artist.id = tags.artist \"
+ \"LEFT JOIN genre ON genre.id = tags.genre \"
+ \"LEFT JOIN year.id = tags.year \"
+ \"WHERE tags.url = \'%1\';\" )
.arg( escapeString( bundle->url().path() ) ) );

if ( !values.empty() )
@@ -2773,7 +2816,9 @@
QStringList artists;
QStringList dirs;

- albums = query( QString( \"SELECT DISTINCT album.name FROM tags_temp, album%1 AS album WHERE tags_temp.dir = \'%2\' AND album.id = tags_temp.album;\" )
+ albums = query( QString( \"SELECT DISTINCT album.name \"
+ \"FROM tags_temp, album%1 AS album \"
+ \"WHERE tags_temp.dir = \'%2\' AND album.id = tags_temp.album;\" )
.arg( temporary ? \"_temp\" : \"\" )
.arg( escapeString( path ) ));

@@ -2782,10 +2827,12 @@
if ( albums[ i ].isEmpty() ) continue;

const uint album_id = albumID( albums[ i ], false, temporary, false );
- artists = query( QString( \"SELECT DISTINCT artist.name FROM tags_temp, artist%1 AS artist WHERE tags_temp.album = \'%2\' AND tags_temp.artist = artist.id;\" )
+ artists = query( QString( \"SELECT DISTINCT artist.name \"
+ \"FROM tags_temp, artist%1 AS artist \"
+ \"WHERE tags_temp.album = \'%2\' AND tags_temp.artist = artist.id;\" )
.arg( temporary ? \"_temp\" : \"\" )
.arg( album_id ) );
- dirs = query( QString( \"SELECT DISTINCT dir FROM tags_temp WHERE album = \'%1\';\" )
+ dirs = query( QString( \"SELECT dir FROM tags_temp WHERE album = \'%1\';\" )
.arg( album_id ) );

if ( artists.count() > dirs.count() )
[/code:1]
Jean pégésql
Karma
0

Re:Database patch

Thu Mar 30, 2006 10:18 pm
Some questions:

- I wonder what the embed table is for. In my database, there is no embed table.

- PostgreSQL allows to set a primary key on a text column. We should check if this is allowed in MySQL and SQLite.

- PostgreSQL primary key is an implicit index. Therefore, there is no need to set another index on the same column. Check if this is the same in MySQL and SQLite.

- What is the unique key of table amazon? I cannot know it, as I have zero information about Amazon web service. Please inform me.

- LEFT JOINs are supported by the 3 databases. If there are some restrictions in MySQL and SQLite, please inform me.

- I did not understand the very bottom of the code ... which is supposed to bring \"some\" abstraction, but does nothing but confuse the developer. Can someone comment on the 20% last percent of the code. Very confusing.

I did not had time to compile Amarok. Also, I did not (yet) implement views, as this will be a second step, after we are sure that nothing is broken. Normally, views are supported by MySQL and SQLite.

Kind regards,
Jean
Jean pégé
Karma
0

Re:Database patch

Thu Mar 30, 2006 10:44 pm
Also, we must discuss what happens when a drag&drop is issued.

Presently, a query is issed for each file, which is huge on large collections. 100 files results in 100 query. This is why drag&drop does not work well.

Of course, there should be only one query in case of a drag&drop.

Please do not answer this post. This is only a reminder for IRC discussion.


Bookmarks



Who is online

Registered users: abc72656, Bing [Bot], daret, Google [Bot], Sogou [Bot], Yahoo [Bot]