Problem: our Screenshot tests CI job is running too long.
- The build takes between 38 and 46 minutes.
- We are close to the 50min timeout and will soon hit this issue all the time.
- Out of the last 10 builds, 20% of them timed out and took more than 50 minutes.
Goal:
- The UI tests should ideally run in less than 20-25 minutes.
Possible solution:
- Maybe we split the job into two separate jobs, each running half of the UI tests.
- Or any other idea?
I have been working on integrating Docker into Piwik lately. It could have a lot of nice outcomes (replace Vagrant, automatically deploy branches in staging/demo very cheaply - e.g. 100 different branches/demos deployed on a single VPS, since a container not being used has 0 overhead, etc.), but the main one is targeted at tests.
I now have Piwik fully working in Docker and have been trying out running tests in parallel: Docker offers isolation between "VMs" (not really VMs, that's the point) with the advantage of being very cheap and fast to set up and run (no overhead).
I gave running 4 specs a try (between 30s and 60s each):
- Running them in sequence (today's behavior) takes 3 minutes.
- Running them in parallel takes 1 minute 10s (the time of the slowest spec…).
The machine I used to run the tests has 2 cores (so I'm pushing it a bit…) and 4 GB of RAM. Each spec is between 20% and 40% slower when run in parallel: I expect this is because I run 4 tests on only 2 cores. With one spec per core I think they wouldn't be any slower, but even 20%-40% slower the total time is still better, so it's fine.
The good thing with this solution is that it scales. We could buy a 12-core server and run 12 tests in parallel (over-simplification: 40 minutes / 12 = 3.3 minutes…), with the setup time being constant and minimal: we set up Piwik only once (unlike if we were to split into different jobs on Travis). That could also help make this build much faster.
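To make the idea concrete, here is a minimal sketch of the fan-out: feed the spec names to a small worker pool that runs each one in its own isolated environment. The spec names and the `echo` placeholder are hypothetical - the real command would start a container per spec - this only illustrates the parallel dispatch.

```shell
# Hypothetical sketch: run each spec concurrently, at most $PARALLEL at a time.
# In the real setup the echo would be replaced by "start a container and run
# the spec inside it"; spec names here are made up.
PARALLEL=4
printf '%s\n' Dashboard_spec Goals_spec Login_spec Widgets_spec |
  xargs -P "$PARALLEL" -I {} sh -c 'echo "running {} in its own container"'
```

With `-P 4`, the total wall time is roughly the time of the slowest spec, matching the 1'10" observed above.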
Also, there are other very interesting technologies like Docker Machine, which allows running the same command on a remote machine (like Vagrant and its providers). Running the tests on AWS or elsewhere (e.g. our CI instance if it existed) wouldn't require anything other than configuring Docker Machine (no code to write, no separate command, etc.). And it would be extremely fast (because of the parallelization). But here I'm looking far ahead…
So to sum up, yes, that means some work. But UI tests and Travis have been a problem forever; if we sum up all the time spent on this, it would be well worth spending the time on a nice solution :)
If interested I can push my current branch with Docker. I'll keep looking into this in the meantime.
I would seriously not want to maintain and host our own CI environment. When I needed quick results I just used our tests:run-aws command, where UI tests usually finish within 20 minutes or so.
Before doing any more on this I'd do 2 things:
- Split the UI build into 2 builds. This is easy to do and will help us immediately.
- Profile where the time is spent in the UI tests (not in detail with any big tools). I'd just log the time needed to:
  - compare screenshots
  - load pages
  - wait
  - ...
Currently we do not even know where most of the time is spent and what makes it slow. Before thinking about different solutions we should try to understand what is actually going on with our current tests. To measure it we could use some simple "timers": just increase them after each test run and log the accumulated time at the end. I'd also log some details of the Travis CI VM, e.g. which CPUs, how many, and how much RAM. That way, when analyzing the logged data we can compare the results a bit better.
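As a rough sketch of the "simple timers" idea, a phase can be wrapped in a small function that logs the elapsed seconds. The phase names are examples only - they are not existing Piwik timers - and the real accumulation would happen inside the test runner rather than in a shell wrapper:

```shell
# Hypothetical timer sketch: time a phase of a test run and log the elapsed
# seconds. "sleep" stands in for real work (comparing screenshots, loading
# pages, etc.).
log_phase() {
  name=$1; shift
  start=$(date +%s)
  "$@" >/dev/null 2>&1          # run the phase
  end=$(date +%s)
  echo "$name: $((end - start))s"
}
log_phase "compare screenshots" sleep 1
log_phase "load pages" sleep 1
```

Summing these per-phase numbers over a whole build would show whether screenshot comparison, page loading, or waiting dominates.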
FYI: I tried to have a look at what is slow without having to sum the times manually, via remote debugging (https://drupalize.me/blog/201410/using-remote-debugger-casperjs-and-phantomjs), but it didn't work for me.
Update: I kinda got it working and will see if I get something out of this.
A code change was needed to make it work. Attached are some screenshots. One can debug things this way step by step, e.g. the page-renderer but also websites. This way I at least found out why some things in my branch are slow. I might push the changes that were needed to remote debug.
Well done - let us know what you find! For sure your changes to make remote debugging possible would be nice (along with a little README extract explaining how to remote debug - or a link to a URL that explains it).
The UI tests CI job runs between 35min and 40min and regularly times out. Hopefully we can find something soon to help this situation - or at least split the CI job across two jobs as a quick fix (unless you have another suggestion for a quick fix for when we are next hit by this problem).
What are we gonna do here? Does anyone have a problem with splitting them into multiple jobs (2 or more)?
I'm just going to link what I worked on a few weeks ago so that it's not lost if anybody wants to reuse it: https://github.com/piwik/piwik/compare/docker
This branch contains the following changes:
- make Piwik run on Docker
- run UI tests in parallel:
./console tests:run-ui-docker --parallel=12
The command runs each UI test suite in isolated containers (own tmp filesystem, own database, etc.). There is a --parallel option to set how many test suites to run in parallel. From what I have measured, the fastest results are achieved with 12 parallel tests on an 8-core machine. On a 16-core machine we can of course run more in parallel.
The command works in a very non-optimized way: it runs 12 test suites in parallel and waits for all of them to finish, then starts 12 more. That means if one test suite takes 30s and another takes 1'30s, then for 1 minute a container does nothing. I haven't taken the time to work on it, but truly parallelizing it would improve the run time a lot (some test suites are really fast). It would be even better to run the longest test suites first and the shortest ones at the end.
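The scheduling improvement described above can be sketched in a few lines: sort the suites by expected duration (longest first) and hand them to a worker pool that picks up the next suite as soon as a slot frees, instead of waiting for a whole batch of 12 to finish. The suite names and durations below are made up, and `echo` stands in for actually launching a container:

```shell
# Hypothetical scheduling sketch: "duration suite" pairs, longest first,
# fed to a pool of 2 workers that refills each slot as soon as it frees.
cat > suites.txt <<'EOF'
30 Dashboard_spec
90 UIIntegrationTestSuite
15 Login_spec
60 Goals_spec
EOF
sort -rn suites.txt | awk '{print $2}' |
  xargs -P 2 -I {} sh -c 'echo "start {}"'
```

Longest-first ordering matters because a long suite started last would leave all the other workers idle while it finishes alone.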
With the current implementation I was able to run the UI tests in 6 minutes instead of 40 on an 8-core machine (the 40 minutes were on AWS).
I think it would be possible to run it in 3-4 minutes with more cores and optimized parallelization, including git clone, etc. That's to compare with 40-45 minutes on Travis.
### Long term idea
I think an ideal solution would be to have a CI server that only runs the UI tests and contains the UI build-artifacts viewer. Yes, it would be a lot of work to maintain, but a lot of work has already been spent on the build-artifacts UI, the Travis config, the "how to make it faster", the "run on AWS" command, etc. So it's not much more than what we already invest today. Also, considering all the time lost because of Travis and UI tests, it would be worth it.
The UI tests would run either on push to GitHub (web hook) or through a console command (which would replace "run on AWS"). It would also be worth looking into Docker Machine, as it's exactly what "run on AWS" is about, so maybe it can be used with minimal effort.
On top of being fast (both for CI and "locally" if we use the remote run feature), Docker would also guarantee exactly the same environment between CI and locally, thus easier debugging (and much faster debugging too).
It would also simplify running UI tests as one could either run them with Docker (need to install Docker) or through the remote command (the replacement for "run on AWS").
Also, having a staging deployment for each branch would be easier thanks to Docker (that's something I mentioned would be very useful for validating new features or reviewing UI changes).
To conclude, I'm just leaving this here to present the results and explain the idea. Feel free to reuse it or not. I'm not arguing for anything, I'm just documenting.
@diosmosis do you see a problem with splitting UI tests into two parts like this: https://github.com/piwik/piwik/compare/8222?expand=1#diff-354f30a63fb0907d4ad57269548329e3R33
Not sure re possible side effects for plugins etc
FYI: I only disabled comparison of travis-yml https://github.com/piwik/piwik/compare/8222?expand=1#diff-354f30a63fb0907d4ad57269548329e3L93 for test purposes to keep things simple.
One job running all UI test usually takes between 35 and 45 minutes I think.
Running 2 jobs with about 50% of the tests each took:
- 136 tests: 15min (17min including setup)
- 168 tests: 22min (24min including setup)

We could balance it a bit better so each job spends about the same amount of time; the UI tests should then take about 20 minutes. We could even split them into 3 UI test groups and run UIIntegrationTestSuite in a separate job.
It's not a perfect solution, but it looks easy to do and would help right now.
I don't think it's necessary to split the build for plugins, just for core. So you don't need to edit the Travis scripts to add the job for plugins; just add the command-line options to the test system and change the matrix in the core .travis.yml file.
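To illustrate the matrix change, a rough sketch of what the split might look like in the core .travis.yml (the variable names and the idea of a group suffix are hypothetical examples, not the actual Piwik setup):

```yaml
# hypothetical matrix entries splitting the UI job into two halves;
# the env variable names are illustrative only
env:
  matrix:
    - TEST_SUITE=UITests UI_GROUP=1of2
    - TEST_SUITE=UITests UI_GROUP=2of2
```

Travis would then start one build job per matrix entry, and the test runner would use the group variable to select which half of the suites to run.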