Originally Posted by Xyliah
Time has passed since my first posting here. Since then my Lua knowledge has increased by quite a bit, I would say, but I still don't really have a clue about efficiency and the efficient use of events.
Let's take a scenario from my mind (COMBAT_LOG_EVENT_UNFILTERED compared to UNIT_AURA).
Basically my logic is this:
- fewer UNIT_AURA events mean fewer executions of my script, which means less memory usage
- whereas CLEU is fired way more often, so it should use more memory
- my conclusion: if I could use UNIT_AURA events to accomplish my aims, it would be more efficient than CLEU
Some interesting topics for discussion, to be sure. I'll provide what information I can on the subject to hopefully point you in the right direction.
First and foremost, it is very important to distinguish between the two core concepts of efficiency when discussing addons: Memory and CPU cycles. Most of the time (I'd wager virtually all of the time), when players complain about their addons "using a lot of memory" or wanting to make sure their addons "use low memory", what players really mean is they want to ensure solid performance, and the measurement to indicate performance is almost always (incorrectly) assumed to be memory.
The reality is that memory and CPU cycles are dramatically divergent concepts where Lua in WoW is concerned, and, generally speaking, the more memory an addon or script takes up, the fewer CPU cycles it will use (and vice versa). The issue most players face is the notion that an addon's memory footprint (the actual memory in KB or MB that the addon is using at any given time according to the in-game profiler) is somehow a significant factor in their performance, when this is very rarely the case. In fact, an addon's memory usage is almost never a good indicator of the performance impact that addon is having on the game; the real impact comes from CPU usage.
Consider that memory is simply temporary storage for some form of data; in the case of addons, this is everything from tables to frames to variables with values. However, Lua is an exceptionally fast scripting language, and since World of Warcraft 2.0, which added Lua 5.1 support, the game has used an incremental garbage collector for all Lua scripts. Effectively, this means that memory Lua used only briefly is constantly being collected and recycled, so even spikes in memory from an addon or script are reclaimed by the garbage collector in very short order.
Therefore, the actual memory usage of an addon is only relevant if it is at risk of pushing your entire computer's memory usage to its limit, which is very unlikely these days with most people on 4 GB or more. The impact of a single addon, even a very expensive and highly inefficient one, is rarely going to be over 10-20 MB, which is a pittance compared to everything else that is happening on your computer. For example, if you play WoW with a single browser window open in Chrome or Firefox, chances are that single tab will be using more memory than even the worst-case WoW addon during its memory spikes prior to garbage collection.
The far more important impact that addons have on performance is CPU usage. The most blatant case of this is when an addon performs a massive loop or series of functions without throttling of some kind. This is usually noticed when an addon assigns an OnUpdate script to multiple frames, which as you know means the script will run once per frame refresh for every frame to which it is attached. To alleviate this, all good addons use some form of throttling to prevent execution of the main script until a certain period of time has elapsed, like so:
{PlaceHolder}
In this example, a setting of "interval" dictates the timespan after which the next OnUpdate call should proceed with processing, otherwise each call is ignored. While there are a handful of calls made every time OnUpdate runs, they are few enough that performance is not affected, since effectively all that is being done is assigning our new timeSince value to hold the time since we've last processed, and then checking that value against our interval threshold to allow further processing or not.
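A minimal sketch of such a throttle gate, using the interval and timeSince names from the description above. MakeThrottledOnUpdate is a placeholder name of my own, and the simulated frames at the bottom only stand in for the game calling the handler:

```lua
-- Hypothetical helper: wraps `callback` so it runs at most once per `interval` seconds.
local function MakeThrottledOnUpdate(interval, callback)
    local timeSince = 0 -- seconds accumulated since the last real run
    return function(self, elapsed)
        timeSince = timeSince + elapsed
        if timeSince < interval then
            return -- too soon: ignore this OnUpdate call
        end
        timeSince = 0
        callback(self)
    end
end

-- Count how often the "expensive" work actually runs.
local runs = 0
local onUpdate = MakeThrottledOnUpdate(0.05, function() runs = runs + 1 end)

-- In game this would be attached with: frame:SetScript("OnUpdate", onUpdate)
-- Outside the game, simulate 100 frames of 0.02 seconds each (two seconds at 50 FPS).
for _ = 1, 100 do
    onUpdate(nil, 0.02)
end
print(runs) -- the callback ran on every third frame only
```

Every call still pays for one addition and one comparison, but the callback itself only runs once the accumulated time passes the interval.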
An even better illustration, though one rarely seen in-game because an author would notice it right away, is a never-ending loop. Just like an OnUpdate script without a timeSince throttle gate, a perpetual loop will hijack nearly all CPU cycles and freeze the game until it crashes (very quickly thereafter):
{PlaceHolder}
That said, the majority of addons will have a CPU usage profile far below these examples, even those that utilize OnUpdate frequently. This is again because Lua is so efficient and fast that only a process lengthy or complicated enough to take longer than a full OnUpdate cycle (i.e. a frame refresh) will have a negative impact on performance when tied to said OnUpdate script. That is certainly possible with a poorly written addon and OnUpdate function, but generally speaking most OnUpdate calls in an established addon (such as WeakAuras) will have a throttle of some kind in effect.
What this all ultimately means is that worrying too much about optimization for WoW addons/Lua is unnecessary.
Having said all that, there are certainly some tools available to help you analyze your code if you truly want to focus on optimization. Just keep in mind this is rarely worth the effort except in extreme cases.
Profiling
You can profile both the CPU and memory usage within WoW by toggling it with the command: /console set scriptProfile "1" (then reload your UI for it to take effect).
Now you can use a variety of functions for CPU & memory, such as:
ResetCPUUsage()
UpdateAddOnCPUUsage()
GetAddOnCPUUsage(addon)
GetFunctionCPUUsage(func, subs)
GetFrameCPUUsage(frame, children)
GetEventCPUUsage(event)
UpdateAddOnMemoryUsage()
GetAddOnMemoryUsage(addon)
They're all pretty self-explanatory by the names and parameters, but the basic usage is once you enable profiling, you then must make a call to the UpdateAddOnXXXUsage() function first, then you can call any of the other GetXXXUsage functions afterward to retrieve that data.
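As a rough sketch of that update-then-read order: GetAddOnUsage is a helper name of my own, and the api parameter exists only so the snippet can run outside the game with stand-in functions (in game it falls back to the real globals). As far as I know the CPU figure is in milliseconds and the memory figure in KB, but treat the units as an assumption.

```lua
-- Hypothetical helper: `api` defaults to the global environment, where the WoW
-- client exposes these functions; a table of stand-ins can be passed instead.
local function GetAddOnUsage(addon, api)
    api = api or _G
    -- Refresh the profiler's snapshots first, then read from them.
    api.UpdateAddOnCPUUsage()
    api.UpdateAddOnMemoryUsage()
    local cpu = api.GetAddOnCPUUsage(addon)
    local mem = api.GetAddOnMemoryUsage(addon)
    return cpu, mem
end

-- Stand-ins with made-up numbers so the example runs outside the game.
local calls = {}
local fakeApi = {
    UpdateAddOnCPUUsage = function() calls[#calls + 1] = "cpu" end,
    UpdateAddOnMemoryUsage = function() calls[#calls + 1] = "mem" end,
    GetAddOnCPUUsage = function(addon) return 12.5 end,
    GetAddOnMemoryUsage = function(addon) return 2048 end,
}

local cpu, mem = GetAddOnUsage("WeakAuras", fakeApi)
print(string.format("%s: %.1f CPU, %.0f memory", "WeakAuras", cpu, mem))
```

In game, with scriptProfile enabled, the call would simply be GetAddOnUsage("WeakAuras").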
As for your specific examples of using CLEU vs. UNIT_AURA, they are a little confusing. You make the original (correct) claim that UNIT_AURA will be executed less frequently than CLEU, since by its very nature CLEU will include all the events that would trigger UNIT_AURA. However, later, in your example of your new function, you mention changing it to utilize CLEU due to an inability to track aura refreshes. I am quite certain that UNIT_AURA does in fact track all forms of aura events (gain, fade, refresh, etc.), since that event is what even the default Blizzard buff frames utilize.
Putting that aside for a moment, while you are correct that the initial example of a double-loop does cause a lot of repetition, it isn't as inefficient as you might think. The code you included isn't a complete function, so, as you say, I'll assume it is just run in a WeakAuras display somewhere and is set to run "every frame", which is tied to the WeakAuras OnUpdate script. I don't know what the default throttle is for OnUpdate in WeakAuras, but you can just start arbitrarily picking values to compare and see what the results are. If we assume a very fast throttle of 0.05 seconds, we're getting 20 refreshes per second, so the first function above would run 20 times a second in this scenario.
How does that compare to CLEU though? You might be surprised at how often combat log events fire in a typical raid environment. Below I picked a random log from the middle of the pack in World of Logs, and picked a random page from it showing all combat log events. Starting at the beginning of Page 50 you can see the timestamp and the number of combat log events just piling up. To see the total events fired within a single random second starting on Page 50, we actually have to look at the next page, Page 51, to see everything within that 1-second period. All told, between the timestamps 21:53:56.713 and 21:53:57.683, there were a total of 252 CLEU events fired. This means that, on the face of it, even a very fast assumed OnUpdate throttle of 0.05 seconds, at 20 times a second, is only 8% of the executions per second that we'd get from a CLEU parse in this random sample.
Therefore, the question then becomes: Is running the first function 8% as often as the second function more or less efficient? That's very difficult to answer by looking at the code and not testing it yourself, but my gut feeling is that the first example will take fewer CPU cycles over time given the infrequency of being called compared to the CLEU example. You could also look through the number of executed lines on average to get a sense as well. The first example, while not as efficient as it could be by a long shot, is very simple.
Code:
for i=1,numPlayers do
A simple loop, very fast to process.
Code:
local unit = prefix..i
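Plain concatenation. Lua has no StringBuilder, and for a two-part string like this, string.format is unlikely to be any faster, so this is fine as written. Using a local variable is always good.
Next comes the inner loop over the aura index (the a passed to UnitAura below), which is another simple loop call.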
Code:
local _, _, _, _, _, duration, expirationTime, unitCaster, _, _, spellId = UnitAura(unit, a)
Function call and local variable assignments.
Code:
if spellId and spellId == ReMistsId then
The first big gate in the code; this is where the majority of loops will fail. So for each of our 1000 iterations in the entire double-loop, the only calls ever made are effectively UnitAura(), the returned variable assignments, and then two value comparisons. Again this code isn't as efficient as it could be (more on that later), but that's a pretty quick-and-dirty way to analyze the efficiency of something, and while a lot of looping seems inefficient, because it's tied to a throttled process of OnUpdate, we're only talking about 20,000 calls a second (sounds high, but isn't).
The second function is a lot more complicated and thus more difficult to analyze step-by-step, but we start at the top:
Code:
if not WA then WA = {} end
if not WA.HoT then WA.HoT = {} end
local player_guid = UnitGUID('PLAYER')
local spell = GetSpellInfo(774)
local now = GetTime()
WA.HoT.targets = WA.HoT.targets or {}
local HoT_targets = WA.HoT.targets
WA.HoT.ordered_targets = WA.HoT.ordered_targets or {}
local HoT_ordered_targets = WA.HoT.ordered_targets
local function SortTable(sourceTable, destTable)
    if type(sourceTable) == 'table' and type(destTable) == 'table' then
        wipe(destTable)
        for k,v in pairs(sourceTable) do
            if v >= now then
                destTable[#destTable + 1] = { k = k, v = v }
            else
                sourceTable[k] = nil
            end
        end
        table.sort(destTable, function(v1, v2) return v1.v < v2.v end)
        for i =1, #destTable do
            destTable[i] = destTable[i].k
        end
        local count = table.getn(HoT_ordered_targets)
        WA.HoT.count = count
    end
end
So early on there are a lot of variable assignments and checks. It should also be noted that WA is a global table and thus inherently less efficient to use than a local (though that's often unavoidable, especially when working with WeakAuras, and as mentioned in the beginning, rarely worth worrying over if it cannot be helped). There I count about 14 comparisons and assignments in total.
Code:
if select(14, ...) == spell and select(5, ...) == player_guid then
Next is our first gate, and as with the previous example, the vast majority will not pass this comparison check, so we'll move on.
Code:
local count = WA.HoT.count
if count == 1 then
    WA.HoT.shortest = HoT_targets[HoT_ordered_targets[1]]
elseif count >= 2 then
    -- smooth animation (some stuttering caused by 'hide on expire' in the main WeakAuras.lua)
    local adjusted = nil
    if time_left(HoT_targets[HoT_ordered_targets[1]]) < 0.1 then
        for i=2, count do -- start loop with 2nd entry
            if time_left(HoT_targets[HoT_ordered_targets[i]]) > 0.1 then
                adjusted = i
                break
            end
        end
        if adjusted then
            WA.HoT.shortest = HoT_targets[HoT_ordered_targets[adjusted]]
        else
            WA.HoT.shortest = HoT_targets[HoT_ordered_targets[1]]
        end
    end
end
if count and count > 0 then
    return true
end
Another 5 or so comparisons/assignments to end it out. So if we assume none of the comparison gates ever allow further processing, it's still a safe bet that you're making approximately 20 calls/comparisons in total as a baseline just by executing this code. Using the random 252 CLEU events per second mentioned above, this gives a baseline average of about 5,040 calls per second. So that's good, right? That's 25% of the calls per second of the first function. However, you must keep in mind this is the baseline and assumes no buffs are ever found. Your function is clearly more complicated (and hopefully for good reason) than the first function, but this comes at a cost. You are making a lot more expensive calls in your function (again, many calls to globals throughout, sort functions, table creations, etc.), especially once a bit of data is actually gathered. For example, once buffs are detected, there are a lot more calls being made (15+) each iteration until that count drops again. Suddenly your call count is 8,000-9,000+ per second, and again that's the baseline from a random log section. What happens when A LOT of stuff is going on and the number of CLEU events for a given period jumps up 4-5 times? If you've ever been in a raid and had a sudden drop in FPS when a lot of stuff is going on, oftentimes that is due to this very issue, where some processing is trying to keep up with the huge number of CLEU events occurring in a short timespan.
The bottom line is, as mentioned early on, optimization is rarely worth overthinking, as more often than not you'll simply overcomplicate your code and make it worse than it was. Even if that's not the case, the extra effort is likely not worth the performance gain, whether that gain is real or simply perceived. In the case of your two example functions, as I said, I would imagine neither is vastly superior to the other, but that really isn't the point: the point is what fires these functions, which is the real determining factor in how much processing is going on. Almost always, a throttled method is better than a lower-average but higher-spike method. While your custom function is clearly meant to do more than just check for the existence of a particular buff, it does seem overly complex, so again I would caution against going too far down that road and causing more harm than good.
My solution in this case? Use a modified version of the first function, driven by either the UNIT_AURA event or an OnUpdate script. The biggest change is simply to make the UnitAura call using the spellName instead of the spellId, which then completely eliminates the need for aura looping:
{PlaceHolder}
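A minimal sketch of that single-loop idea, assuming a client where UnitAura accepts a spell name as described above. CountAura, the "raid" prefix and "Example Aura" are placeholders of mine, and the aura function is passed in as a parameter so the snippet also runs outside the game with a stand-in:

```lua
-- Hypothetical helper: counts raid members carrying a named aura with one loop.
-- `unitAura` stands in for the game's UnitAura (assumed to accept a spell name
-- and to return nil when the aura is absent).
local function CountAura(unitAura, prefix, numPlayers, spellName)
    local count = 0
    for i = 1, numPlayers do
        local unit = prefix .. i
        local name = unitAura(unit, spellName)
        if name then
            count = count + 1
        end
    end
    return count
end

-- Stand-in data: pretend raid3 and raid7 currently have the aura.
local buffed = { raid3 = true, raid7 = true }
local function fakeUnitAura(unit, spellName)
    if buffed[unit] then
        return spellName
    end
end

print(CountAura(fakeUnitAura, "raid", 25, "Example Aura"))
```

In game the call would be CountAura(UnitAura, "raid", 25, spellName). A real version would probably also check the caster return value so that only your own casts are counted.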
Now we've cleaned up the code a great deal, so our base iterations drop from 1000 for the double-loop down to 25, simply iterating over all the raid members and evaluating whether they have the aura in question. With about 5 calls being made each loop iteration plus the extra assignment before the loop, that's 25 * 5 + 1 = 126 calls per run, and 126 * 20 = about 2,520 calls per second if we're using an OnUpdate with a 0.05-second threshold. Or, we can use UNIT_AURA, which again is risky, with high variance depending on how many events fire in a given period, but it would also dramatically reduce the number of calls made, since we can take the UnitId from the event and therefore not iterate over the 25 raiders:
{PlaceHolder}
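One way the event-driven version could look, again only as a sketch. MakeAuraTracker is a name of my own, the aura lookup is passed in so it can be faked outside the game, and the "raidN" filter is my assumption for avoiding double counting when the same player is also reported under another unit token such as "target":

```lua
-- Hypothetical tracker: updates a running count from UNIT_AURA events
-- instead of looping over the whole raid.
local function MakeAuraTracker(unitAura, spellName)
    local active = {} -- unit -> true while the aura is present
    local count = 0

    local function OnEvent(self, event, unit)
        if event ~= "UNIT_AURA" then return end
        -- Only track raidN tokens (assumption, see the note above).
        if not string.find(unit, "^raid%d+$") then return end

        local has = unitAura(unit, spellName) ~= nil
        if has and not active[unit] then
            active[unit] = true
            count = count + 1
        elseif not has and active[unit] then
            active[unit] = nil
            count = count - 1
        end
    end

    return OnEvent, function() return count end
end

-- Stand-in aura state so the example runs outside the game.
local buffed = {}
local function fakeUnitAura(unit, spellName)
    if buffed[unit] then
        return spellName
    end
end

local onEvent, getCount = MakeAuraTracker(fakeUnitAura, "Example Aura")

buffed.raid3 = true
onEvent(nil, "UNIT_AURA", "raid3")  -- aura gained
onEvent(nil, "UNIT_AURA", "raid3")  -- refresh: not counted twice
onEvent(nil, "UNIT_AURA", "target") -- ignored: not a raidN token
print(getCount())

buffed.raid3 = nil
onEvent(nil, "UNIT_AURA", "raid3")  -- aura faded
print(getCount())
```

In game, the handler would be hooked up with frame:RegisterEvent("UNIT_AURA") and frame:SetScript("OnEvent", onEvent), passing UnitAura as the lookup.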
Now we're down to 5-6 calls per execution in total, multiplied by however many UNIT_AURA events fire in a given second. Using the same log and timestamps as in the previous example, we can see a total of 75 UNIT_AURA events in that same one-second period. So we're looking at around 375 total calls per second as a baseline with the above example using UNIT_AURA.
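To put the rough estimates from this post side by side, here is the arithmetic as a small script. All inputs are the figures quoted above (the per-call counts are my eyeballed estimates, not measurements):

```lua
-- Figures taken from the discussion above.
local refreshesPerSecond = 20  -- OnUpdate with a 0.05 second throttle
local cleuPerSecond      = 252 -- CLEU events in the sampled second
local unitAuraPerSecond  = 75  -- UNIT_AURA events in the same second

local doubleLoop   = 1000 * refreshesPerSecond         -- original double-loop
local cleuBaseline = cleuPerSecond * 20                -- ~20 calls per CLEU event
local cleuWithBuff = cleuPerSecond * (20 + 15)         -- 15+ extra calls once buffs are found
local singleLoop   = (25 * 5 + 1) * refreshesPerSecond -- name-based lookup, one loop
local unitAura     = unitAuraPerSecond * 5             -- event-driven, about 5 calls each

print(string.format("double-loop OnUpdate : %d", doubleLoop))
print(string.format("CLEU baseline        : %d", cleuBaseline))
print(string.format("CLEU with buffs      : %d", cleuWithBuff))
print(string.format("single-loop OnUpdate : %d", singleLoop))
print(string.format("UNIT_AURA handler    : %d", unitAura))
print(string.format("OnUpdate runs vs CLEU events: %.1f%%",
    100 * refreshesPerSecond / cleuPerSecond))
```

These are counts of calls, not timings, so they only give a sense of scale; the profiling functions mentioned earlier are the way to get real numbers.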
Anyway, I hope that helps a bit. If you want some more information, I strongly suggest finding a copy of Beginning Lua with World of Warcraft Addons, which is a great (albeit old) book with a ton of useful information about Lua and how it can be utilized in WoW. Unfortunately I wouldn't have the first idea of where you might look for such a thing, but it's a good read if you can manage it.