@tsteur opened this issue on April 7th 2015

E.g. in Tracker mode, 14% of my wall time is spent in Db::logSql() even though it doesn't actually log anything because of the log level. Also, when archiving, thousands of queries are triggered and quite a lot of time is spent there in total.

We discussed this about 2 weeks ago in the chat, and an idea is to introduce a new config setting [Debug] log_sql_queries.

Maybe in the tracker we should always use the NullLogger with no handlers for best performance, unless [Tracker] debug = 1 is enabled? Not sure!
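
For illustration, here is a minimal sketch of how those two settings could be wired up. Only the setting names ([Debug] log_sql_queries and [Tracker] debug) come from this discussion; the helper functions and the way the config is read are assumptions, not Piwik's actual implementation.

use Piwik\Config;
use Psr\Log\LoggerInterface;
use Psr\Log\NullLogger;

// Sketch only: the helpers below are hypothetical, not Piwik's actual API.
function createTrackerLogger(LoggerInterface $realLogger)
{
    // Unless [Tracker] debug = 1, give the tracker a no-op logger so that
    // every log call (including SQL logging) is essentially free.
    $trackerSettings = Config::getInstance()->Tracker;
    if (empty($trackerSettings['debug'])) {
        return new NullLogger();
    }
    return $realLogger;
}

function shouldLogSqlQueries()
{
    // [Debug] log_sql_queries = 1 would explicitly opt in to SQL logging.
    $debugSettings = Config::getInstance()->Debug;
    return !empty($debugSettings['log_sql_queries']);
}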

@diosmosis commented on April 9th 2015

Do you remember how many times the method was called, the total wall time, and whether any specific methods called by Db::logSql took up most of the time? Also, isn't the tracker Db different from \Piwik\Db? I guess some parts call Db::, but I was under the assumption most tracker code didn't.

@tsteur commented on April 9th 2015

Another run shows 120ms total. There's a method Piwik\Tracker\Model::getIdsAction that does a fetchAll which triggered the logSql. It still uses some parts of Db:: as the tracker Db extends this one. Anyway, even in the normal UI or when archiving it makes things slower.

@diosmosis commented on April 9th 2015

120ms out of what total? And that's just for logSql? Did any functions called somewhere within logSql's execution take a lot of time, or is it just the if statement in Logger?

I don't really care if logSql is run or not (it was a PRO developer request from the last meetup), but it seems strange that what should essentially be just if (!shouldLog($level)) { return; } results in noticeably slower execution. If there is an issue here, it bears investigation.
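
For reference, the cheap guard described above would be something along these lines; the class, constant, and method names are purely illustrative and not Piwik's actual Log implementation.

class Logger
{
    const DEBUG = 100;
    const INFO  = 200;

    /** @var int Minimum level that actually gets written out. */
    private $minimumLevel = self::INFO;

    public function logSql($sql)
    {
        // With SQL logging disabled, the whole call should cost little more
        // than this single comparison.
        if (self::DEBUG < $this->minimumLevel) {
            return;
        }

        $this->writeRecord(self::DEBUG, $sql);
    }

    private function writeRecord($level, $message)
    {
        // ... hand the record off to the configured writers/handlers ...
    }
}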

Did you test w/o xhprof/xdebug while doing a large bulk track?

@tsteur commented on April 10th 2015

It takes about 14% of the 120ms total time.

I tested with xhprof but without xdebug, no bulk. I do not have the run here anymore to see what exactly was slow.

@diosmosis commented on April 10th 2015

For posterity/future work: Tracker\Db.php doesn't inherit from Piwik\Db.php, so it's possible something's using Piwik\Db in the tracker incorrectly.

@diosmosis commented on April 17th 2015

I did a bulk tracking test (300 actions / 100 visits), and Db::logSql was only called 20 times and took 0.07% of wall time. It's only called when autoloading Option values in Cache::getCacheGeneral. Making sure it uses the correct Db will fix this.

@tsteur commented on April 17th 2015

Why bulk tracking? Can you try with a normal tracking request? And with xdebug disabled + development mode enabled and caches warmed up, it should have a bigger effect. Anyhow, if it is fixed when using the Tracker\Db then it is fine. It also had an effect when archiving and using the normal Db (see the issue description), but I don't remember how much it was.

@diosmosis commented on April 17th 2015

I used bulk tracking to see if it would have a bigger effect. I ran it with xdebug disabled + development mode enabled. I don't know what you mean by caches warmed up?

FYI, I used blackfire, so I think I can send you a link.

It makes sense that archiving would have an issue, since it doesn't use the Tracker\Db class.

@diosmosis commented on April 17th 2015

With one tracking request, logSql is still only called by Option::autoload and only called 20 times. It represents a bigger portion of the time, but still only 1.3%.

@diosmosis commented on April 18th 2015

On further examination, Option::autoload is only called by the tracker when running scheduled tasks. So there's nothing to do on the tracker side.

I'm looking at core:archive profiles.

@diosmosis commented on April 18th 2015

During a core:archive command, logSql takes up 0.07% of the total time. My relevant config is:

[Development]
enabled = 1
disable_merged_assets = 1

[log]
log_writers[] = file
log_level = INFO

Can you confirm the results you last found, and perhaps share the profile?

@tsteur commented on April 19th 2015

You should really disable development mode when profiling anything as it gives you wrong results. It will spend more time in things that are usually cached etc. By warming up the cache I mean: make sure to perform one or two requests before actually profiling so that everything is cached.

I don't use blackfire but used it to quickly upload one run from xhprof https://blackfire.io/profiles/b17f20df-2ec4-4c4f-b930-636768eefc6e/graph

When you test the archiver, do you have many sites, with like 1mio actual urls, many different actions, also lots of other data etc? Which period did you profile? Sometimes archiving one period for one site takes 50 minutes. Even 1% can actually be quite a bit then, although it is surely not much.

@diosmosis commented on April 19th 2015

You should really disable development mode when profiling anything as it gives you wrong results.

Did not know that; I assumed from your comment above that I should run it with development mode enabled.

I don't use blackfire but used it to quickly upload one run from xhprof https://blackfire.io/profiles/b17f20df-2ec4-4c4f-b930-636768eefc6e/graph

From your profile, I see that logSql is called when autoloading Options, and only when remembering to invalidate reports. I also see that most of the time comes not from logSql but from Log::getInstance(). So going through DI to set up the logger is the problem here, at least for the tracker. I'll look into why the normal Db is being used when loading options.
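
One generic way to avoid that repeated lookup is to resolve the logger once and memoize it. The sketch below shows only that idea, not the fix that was actually made; the container key, class layout, and method signature are assumptions.

use Piwik\Container\StaticContainer;
use Psr\Log\LoggerInterface;

class Db
{
    /** @var LoggerInterface|null Cached logger so logSql() skips the DI lookup. */
    private static $logger = null;

    private static function getLogger()
    {
        if (self::$logger === null) {
            // Resolve through the container once, not on every query.
            self::$logger = StaticContainer::get('Psr\Log\LoggerInterface');
        }
        return self::$logger;
    }

    public static function logSql($functionName, $sql, $parameters = array())
    {
        self::getLogger()->debug('SQL query', array(
            'function'   => $functionName,
            'sql'        => $sql,
            'parameters' => $parameters,
        ));
    }
}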

When you test the archiver, do you have many sites, with like 1mio actual urls, many different actions, also lots of other data etc? Which period did you profile?

I had 20 sites and was using demo2db's data, but only doing archiving for one month of it, so it wasn't a lot. I think the profile you did would be rather complicated to do again; do you still have it? In the tracker the issue is setting up the logger, but this should already be done for archiving, so I'm not sure why an if (logLevel > levelToLog) would be even a slightly expensive operation in this case. The profile would help in figuring this out.

In the future, it would be very useful for these profiles to be uploaded and linked in issues like these.

@tsteur commented on April 20th 2015

Sorry, I meant development mode disabled and caches warmed up.

I don't have the other profile from archiving anymore, but I should still have the tracker profile from this morning, if that's what you meant? I can send it on Slack.

I can upload profiles in the future; in this case I thought it wouldn't be a big deal, as adding the requested option shouldn't be a problem. In most cases the logged SQL statements are not needed at all, and when they are needed the option could simply be enabled, I thought. It would of course help when the wrong DB is used. I didn't think it would use the wrong one.

@diosmosis commented on April 20th 2015

I don't have the other profile from archiving anymore, but I should still have the tracker profile from this morning, if that's what you meant? I can send it on Slack.

The tracker profile is easy to get; the archiving one would be good to see. I'll try to replicate it.

I can upload profiles in the future; in this case I thought it wouldn't be a big deal, as adding the requested option shouldn't be a problem. In most cases the logged SQL statements are not needed at all, and when they are needed the option could simply be enabled, I thought.

I'm not sure I see the reason for adding the option, since those logs should only be logged at the DEBUG level. Adding the option might help us when debugging (to clear out a lot of the crap), but putting it in right now might hide the real problem we haven't found yet (at least for archiving).

I'll try to run a profile on EC2 for archiving. Did you use a specific data dump (maybe the data on testing-big)?

@tsteur commented on April 20th 2015

Yeah, testing-big. Warning: it can take some hours to archive :)

If adding this option is a problem, I suggest just moving this issue to mid-term or so. In my mind it was OK to just enable that option when needed, but if this is not the case, the time spent on this issue is probably not worth it and we can improve more elsewhere. I mean regarding the archiver.

The tracker should be fixed though.

@diosmosis commented on April 20th 2015

Adding the INI option is fine, but I want to find the underlying performance issue first, just so we don't accidentally cover it up.

@mnapoli commented on April 20th 2015

By the way, remember also to optimize the autoloader (composer dump-autoload -o); it makes a very big difference in performance too.

@diosmosis commented on April 20th 2015

Added some optimizations in #7711, feel free to review/merge.

This issue was closed on April 22nd 2015