Extracting units from a source

Tags: units, source, regular expression, sweden
zynex
Thu Sep 13, 2012 5:01 pm
Hi.

Recently Morningstar stopped listing funds sold in Sweden, so I had to create a new source that downloads the unit prices from the Swedish Morningstar site. The problem is that Skrooge can't handle the price downloaded from morningstar.se. In Sweden we use a comma instead of a dot to separate the integer part from the decimals (e.g. 123,12 instead of 123.12). It seems Skrooge is hard-coded to use dots?

Is there any way to change this behaviour, or to use a regular expression to change the comma to a dot?

Here's the regular expression I use to extract the price:
price=NAV</td><td>\s\s([^"]+)\s

It works if I use this, but then I only get the integer part:
price=NAV</td><td>\s\s([^"]+),

Solution?
smankowski
Fri Sep 14, 2012 7:13 am
Hi,

Skrooge is able to understand values with either a dot or a comma.
In France we also use the comma as the decimal separator, and I am able to download values from French web sites.

What exactly is the error raised?
Could you give me a URL where you try to get the value?


Skrooge, a personal finances manager powered by KDE
zynex
Wed Sep 19, 2012 3:05 pm
Maybe I did it wrong then? The script works when the decimal part is not included. Here's the complete script:

Code:
#The URL of the source. %1 will be replaced by the internet code of the unit
url=http://www.morningstar.se/Funds/Quicktake/Overview.aspx?perfid=%1

#The mode (HTML or CSV). In HTML mode, only one value will be extracted from downloaded page. In CSV mode, a value per line will be extracted.
mode=HTML

#The regular expression for the price (see http://doc.qt.nokia.com/latest/qregexp.html)
#price=vkey:\{NAV:"([^"]+)"
price=NAV</td><td>([^"]+)\s

#The regular expression for the date (see http://doc.qt.nokia.com/latest/qregexp.html)
#date=LastDate:"(\d+-\d+-\d+) 00:00:00"
date=<b>Kursdatum</b>:\s(\d+-\d+-\d+)\s

#The format of the date (see http://doc.qt.nokia.com/latest/qdate.html#fromString-2)
dateformat=yyyy-MM-dd


I have tested the regular expression against an online service, and it seems to work there. The string I want to extract the number from looks like this:

Code:
<td>Senaste NAV</td><td>  122,17 SEK</td><td>2012-09-18</td>


If I use "price=NAV</td><td>([^"]+)," instead of "price=NAV</td><td>([^"]+)\s", it works, but then I don't get the decimals. The error is "Price not found for 'Avanza Zero'" (in this case) with regular expression 'NAV</td><td>([^"]+)\s'.
zynex
Wed Sep 19, 2012 3:30 pm
I actually solved it partially.

If I use "price=NAV</td><td>([^"]+)\sSEK" instead of just "price=NAV</td><td>([^"]+)\s", it works. The problem is that some indexes are quoted in a currency other than SEK (Swedish kronor), like USD. Then it won't work.

Shouldn't it work with just a whitespace at the end? The string is "123,45 SEK", which means that there is a whitespace between 123,45 and SEK.

I noticed that "price=NAV</td><td>([^"]+)SEK" works as well. Maybe Skrooge trims all whitespace?
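
One possible explanation, sketched with Python's `re` module standing in for the Qt regular expressions Skrooge uses (an assumption, since the two engines can differ in details): `[^"]+` is greedy and is not stopped by tags, so on a full page it may run on to the last whitespace before the next double quote rather than stopping at the space before SEK. The `page` fragment below is hypothetical, built around the sample row quoted earlier.

```python
import re

# Sample row quoted earlier in the thread.
row = '<td>Senaste NAV</td><td>  122,17 SEK</td><td>2012-09-18</td>'

# Hypothetical continuation: on the real page, more HTML follows the row.
page = row + '\n<p>More text</p> <a href="x">link</a>'

pattern = re.compile(r'NAV</td><td>([^"]+)\s')

# On the isolated row, the last whitespace is the one before "SEK",
# so the capture is just the number (with its leading spaces).
on_row = pattern.search(row).group(1)

# On the longer fragment, the greedy class keeps going and only backs off
# to the last whitespace before the first double quote.
on_page = pattern.search(page).group(1)

print(repr(on_row))
print(repr(on_page))
```

This would also be consistent with the "SEK" variants working: a literal anchor after the group forces the match to end at the currency code instead of somewhere later in the page.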
smankowski
Wed Sep 19, 2012 7:45 pm
Hi,

For me, it works like this:

Code:
#The URL of the source. %1 will be replaced by the internet code of the unit
url=http://www.morningstar.se/Funds/Quicktake/Overview.aspx?perfid=%1

#The mode (HTML or CSV). In HTML mode, only one value will be extracted from downloaded page. In CSV mode, a value per line will be extracted.
mode=HTML

#The regular expression for the price (see http://doc.qt.nokia.com/latest/qregexp.html)
#price=vkey:\{NAV:"([^"]+)"
price=NAV</td><td>\s([^<]+)\s

#The regular expression for the date (see http://doc.qt.nokia.com/latest/qregexp.html)
#date=LastDate:"(\d+-\d+-\d+) 00:00:00"
date=<b>Kursdatum</b>:\s(\d+-\d+-\d+)\s

#The format of the date (see http://doc.qt.nokia.com/latest/qdate.html#fromString-2)
dateformat=yyyy-MM-dd
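
A minimal sketch of why the `[^<]+` version of the price expression behaves better, again using Python's `re` as a stand-in for Qt's regular expressions (an assumption). Because the character class cannot cross a `<`, the capture stays inside the table cell whatever follows on the page, and it does not depend on the currency code. The USD row, the `extract_price` helper and the final `strip()` and `Decimal` conversion are illustrative only and are not taken from Skrooge.

```python
import re
from decimal import Decimal

# Sample row from the thread, plus a hypothetical row in another currency.
row_sek = '<td>Senaste NAV</td><td>  122,17 SEK</td><td>2012-09-18</td>'
row_usd = '<td>Senaste NAV</td><td>  98,40 USD</td><td>2012-09-18</td>'

# Same price expression as in the source definition above.
pattern = re.compile(r'NAV</td><td>\s([^<]+)\s')


def extract_price(html):
    """Return the captured price text without surrounding spaces, or None."""
    match = pattern.search(html)
    return match.group(1).strip() if match else None


def to_decimal(text):
    """Illustrative conversion of a decimal-comma string to a Decimal."""
    return Decimal(text.replace(',', '.'))


for html in (row_sek, row_usd, row_sek + '\n<a href="x">link</a>'):
    price = extract_price(html)
    print(price, to_decimal(price))
```

The capture ends at the last whitespace inside the cell, which is the space before the currency code, so the same expression should work for SEK and for other currencies.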


Don't forget that you can publish your source on opendesktop directly from Skrooge.
If you don't have an account on opendesktop, I can do it for you if you want.


zynex
Thu Sep 20, 2012 8:30 pm
Sweet, works like a charm :)

I didn't know that. How do I do that? I can't find any option for it.
smankowski
Thu Sep 20, 2012 8:37 pm
I think that your question is "How do I push the new source to opendesktop?".
It's simple: to the right of the "Source" field, you can see an amber star.
A short click on it lets you download new sources.
A long click on it lets you upload your sources for everybody.


zynex
Thu Sep 20, 2012 9:11 pm
You learn something new every day :)

I uploaded it through the website this time, but it's good to know how to easily upload stuff from apps :)

