#Possible to further improve performance on class initialization?

clear skiff
#

I have about 100 interfaces which look like this:

interface EmojiArrow {
    Emoji UP_ARROW = new Emoji("⬆️", "\\u2B06\\uFE0F", "&#11014;&#65039;", "&#x2B06;&#xFE0F;", "%E2%AC%86%EF%B8%8F", Collections.unmodifiableList(Arrays.asList(":arrow_up:", ":up_arrow:")), Collections.singletonList(":arrow_up:"), Collections.singletonList(":arrow_up:"), Collections.unmodifiableList(Arrays.asList("arrow", "cardinal", "direction", "north", "up")), false, false, 0.6, FULLY_QUALIFIED, "up arrow", EmojiGroup.SYMBOLS, EmojiSubGroup.ARROW, false);
}

All interfaces together contain 5000+ objects. Do you know any way to further improve the time it takes to initialize all of the interfaces? Currently it takes ~90ms.
I'm on Java 8. Out of curiosity I tried a newer Java version with List.of but this did basically nothing. The Emoji object is just a simple POJO

rain cradle
#

you care about 90ms?

clear skiff
#

This is an emoji library. Someone reported an issue with the library where the usage on old android devices leads to freezes of 1-2 seconds depending on the device. So here I am, asking if there is anything I could do to improve the initialization time any further.

rain cradle
#

then analyze where the actual issue is

clear skiff
#

Not as easy as it sounds. I already tried to profile it but I don't get a lot of information and as far as I've seen it's pretty complex to find the actual causes. Therefore, I am asking here if someone knows how to improve this because maybe someone already had a similar issue.

rain cradle
#

I don't see why you are trying to improve random things if you don't know where the lag is coming from

clear skiff
#

Random things? The issue occurs when initializing the interfaces

rain cradle
#

how do you know?

#

it didn't really become clear from your messages

clear skiff
#

Post title: Possible to further improve performance on class initialization?
Body: Do you know any way to further improve the time it takes to initialize all of the interfaces?

rain cradle
#

you didn't understand. That is what you are trying to do, but do you know that the initialization is causing the lag spike, i.e. have you proven it with a profiler?

clear skiff
#

The user was using another library before. There weren't any issues with it, and on a PC it took 30ms where mine takes 90ms.
I already reduced the library's initialization time from 240ms to 90ms, which greatly improved the startup time on old Android devices. The only difference is that my library loads all interfaces, containing 5000 objects in total, instead of reading a file and parsing it into objects.

grim vigil
#

How are you measuring it, how are you removing random variance?

#

OTEL/JMH/...?

clear skiff
#

JMH

minor lark
#

Arrays.asList allocates an array - that's fine, and my first guess would be "what about List.of", since that might not allocate an array

#

another option is to lazily initialize per constant or something

#

like - and forgive how horrible this is for the moment

#
interface EmojiArrow {
    // Make an Emoji subtype that can defer initialization
    Emoji UP_ARROW = new LazyEmoji(() -> new Emoji("⬆️", "\\u2B06\\uFE0F", "&#11014;&#65039;", "&#x2B06;&#xFE0F;", "%E2%AC%86%EF%B8%8F", Collections.unmodifiableList(Arrays.asList(":arrow_up:", ":up_arrow:")), Collections.singletonList(":arrow_up:"), Collections.singletonList(":arrow_up:"), Collections.unmodifiableList(Arrays.asList("arrow", "cardinal", "direction", "north", "up")), false, false, 0.6, FULLY_QUALIFIED, "up arrow", EmojiGroup.SYMBOLS, EmojiSubGroup.ARROW, false));
}
#

that would help if the problem is just in the actual calls to Collections
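
A minimal sketch of the deferral idea above, assuming a hypothetical generic `Lazy<T>` holder (neither it nor `LazyEmoji` is part of the library). The supplier runs once, on first access, so the class initializer only allocates one small wrapper per constant. On runtimes without `invokedynamic` support each lambda may end up as its own generated class, so this is not guaranteed to be a net win.

```java
import java.util.function.Supplier;

// Hypothetical helper: computes a value once, on first access.
final class Lazy<T> {
    private Supplier<T> supplier;
    private T value;

    Lazy(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    // Synchronized so concurrent first calls still run the supplier only once.
    synchronized T get() {
        if (supplier != null) {
            value = supplier.get();
            supplier = null; // release the lambda once the value exists
        }
        return value;
    }
}

final class LazyDemo {
    // Counts how often the "expensive" construction actually ran.
    static int constructed = 0;

    // Stand-in for an expensive Emoji constructor call.
    static final Lazy<String> UP_ARROW = new Lazy<>(() -> {
        constructed++;
        return "up arrow";
    });
}
```

Loading `LazyDemo` creates the wrapper only; the lambda body runs on the first `UP_ARROW.get()` and never again.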

#

if its the pure number of classes and running their static initializers maybe also

minor lark
#
final class InternalBigEmojiClass {
    // static, so the per-group interfaces can reference it without an instance
    static final Emoji UP_ARROW = ...;
}

interface EmojiArrow {
    Emoji UP_ARROW = InternalBigEmojiClass.UP_ARROW;
}
#

so this would help if it's not just running the initializers, but the sheer number of classes to initialize

surreal pebble
#

Is this Java 8?

surreal pebble
#

so, yeah?

minor lark
#

so we on old java + not even the jvm really

clear skiff
#

Yes, Java 8

surreal pebble
#

I'd definitely use immutable lists

#

listOf in Kotlin

clear skiff
#

And the user who reported this issue is using the library on android

surreal pebble
#

What does Emoji look like?

#

or just share your GH

clear skiff
#

It's a simple POJO

minor lark
#

because it would help us see if the LazyEmoji might be applicable

surreal pebble
#

But, fr, I'd stick all of them in JSON, and just yeet that entire thing in memory

#

Probably faster

#

There's an interface per emoji?

rain cradle
#

per group I think lol

minor lark
#

and that most of it is auto generated from some files

surreal pebble
#

yeah, something I've learned dealing with low power devices

#

The more class files you have, the slower it gets

#

the amount of class files really impacts speed

#

classes, but point still stands

clear skiff
#
    Emoji(
            final String emoji,
            final String unicode,
            final String htmlDec,
            final String htmlHex,
            final String urlEncoded,
            final List<String> discordAliases,
            final List<String> slackAliases,
            final List<String> githubAliases,
            final List<String> keywords,
            final boolean hasFitzpatrick,
            final boolean hasHairStyle,
            final double version,
            final Qualification qualification,
            final String description,
            final EmojiGroup group,
            final EmojiSubGroup subgroup,
            final boolean hasVariationSelectors) {
        this.emoji = emoji;
        this.unicode = unicode;
        this.htmlDec = htmlDec;
        this.htmlHex = htmlHex;
        this.urlEncoded = urlEncoded;
        this.discordAliases = discordAliases;
        this.githubAliases = githubAliases;
        this.slackAliases = slackAliases;
        this.keywords = keywords;
        this.hasFitzpatrick = hasFitzpatrick;
        this.hasHairStyle = hasHairStyle;
        this.version = version;
        this.qualification = qualification;
        this.description = description;
        this.group = group;
        this.subgroup = subgroup;
        this.hasVariationSelectors = hasVariationSelectors;
        final Set<String> aliases = new HashSet<>();
        aliases.addAll(getDiscordAliases());
        aliases.addAll(getGithubAliases());
        aliases.addAll(getSlackAliases());
        allAliases = aliases;
    }
rain cradle
#

sonar wont like that file LUL

surreal pebble
#

hmm, initializing this many sets will kill it

rain cradle
#

you are really misusing interfaces here

surreal pebble
#

yes, github link please

#

Also, for the love of god, if your class is supposed to be immutable, please do that in the constructor, and not outside of the class

surreal pebble
#

Like, all those unmodifiableList calls should just be 4 copies in the constructor, with the getters returning an unmodifiableList
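
A rough sketch of that suggestion, using a hypothetical trimmed-down class with a single list (`ImmutableEmoji` is an illustrative name, not the library's `Emoji`). The constructor takes a defensive copy and the getter hands out a read-only view, so the generated call sites can pass a plain `Arrays.asList(...)`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, trimmed-down Emoji with a single alias list.
final class ImmutableEmoji {
    private final List<String> aliases;

    ImmutableEmoji(List<String> aliases) {
        // Defensive copy: later changes to the caller's list are not visible here.
        this.aliases = new ArrayList<>(aliases);
    }

    List<String> getAliases() {
        // Read-only view, created only when somebody asks for the list.
        return Collections.unmodifiableList(aliases);
    }
}
```

Whether this is faster than wrapping at the call site would have to be measured; it mainly moves the immutability guarantee into the class itself.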

surreal pebble
#

time to scratch "fast" from that readme 😛

minor lark
#

ah cool, yeah i'll clone. easier to doodle in an IDE

surreal pebble
#

aah, and of course it's gradle again, I'll be back in 10 minutes once it finishes loading...

minor lark
#

my life is a blur

surreal pebble
#

Also, android is that java 8?

clear skiff
#

I can put you on that site 😄

clear skiff
# surreal pebble Also, android is that java 8?

From the issue author

Since this is Android, we don't have a classic JRE and we don't have choice over it. On Android, a runtime called ART compiles the Java code to native code, with varying degrees of optimization depending on Android version. I am seeing relatively quick load times on an Android 15 emulator (~100ms on my host machine), but slow times on an Android 7 emulator (> 1s on the same host machine).

surreal pebble
#

OK, what language level do you target then?

#

8 or 9?

clear skiff
#

While the issue is somewhat fixed with a workaround, I'm still curious if there's a way to optimize it further

#

8

surreal pebble
#

hmm, man 9 would be so nice, because it has those immutable collections and those save quite a bit of memory

#

they're also in android now, but I don't know how that all works with compatibility etc

clear skiff
#

Yeah, I can do MRJs

surreal pebble
#

yeah but if you need 8 anyway it doesn't matter

#

too much work for an MRJ

clear skiff
#

But when I tested it and replaced all the collection calls with List.of there wasn't a large benefit

#

I already have an MRJ for module support

surreal pebble
#

no, but those are exactly sized, and immutable, and have free copies

#

so they do have a substantial benefit

#

Esp the sets and maps

minor lark
#

y'all are being so rude to this man

clear skiff
#

So you would think there might be an improvement for android devices while not really being for pc?

surreal pebble
#

Hard to tell

minor lark
#

usually I would put a lambda somewhere to defer work

surreal pebble
#

What I'd definitely do first and foremost is those initializers

#

clean those up as much as possible

#

Qualification.fromString should be hardcoded to the right Qualification value

minor lark
#

thats the only way i can think of - but on such an old vm thats gonna be a class

#

as opposed to an invoke dynamic

#

so it might be a net negative

#

we defer initializing lists but load in a bunch more class files

surreal pebble
#

no worries, I straight up despise gradle

#

anyway, it loaded fine, after installing 3 more jvms, half a gig of sources and complaining my JAVA_HOME is the wrong JDK

minor lark
#

but my computer is an M1 mac, very warm at this point, and i've been waiting this whole time

surreal pebble
#

ANYWAY

clear skiff
# minor lark usually I would put a lambda somewhere to defer work

Yes, but this only defers the point where the issue would occur. The point where someone calls EmojiManager.replaceEmojis is the place where this issue will always appear, as at this point all emojis need to be loaded. Before that you never interact with the library, so it's never initialized. Using a lambda just defers it with an additional call

surreal pebble
#

htmlDec, htmlHex and urlEncoded can ALL be derived quite easily from unicode

#

So I'd yeet those

minor lark
#

like

rain cradle
#

I never had problems with gradle, but also never with maven, no idea why some are hating 🤷‍♂️

minor lark
#

they initialize emojis as the program runs

surreal pebble
#

urlencoded is just utf8 bytes in hex with percent signs, htmldec and htmlhex are just decimal/hex from the codepoints

minor lark
#

but if you are saying its a general "find all emojis" thing and that does it...

minor lark
#

okay here's a thought

#

what if: and I know this is an extra dependency or thing to shade, but

#

what if your emoji pojos just took JsonObject and a pointer to the entry

surreal pebble
#

Remove all of these

minor lark
#

like, its a different sort of cost

#

but you don't need to consider class file sizes or class initialization

minor lark
#

nah that would still work

#

THUMBS_UP = new Emoji(BIG_JSON_BLOB, "thumbs_up")

#

then THUMBS_UP.getHtmlHex() does json.get(ptr).get("htmlHex")

#

so every emoji instance points to the json blob instead of initializing itself

clear skiff
#

But I also wanted a dependency-free library. In the beginning I used Jackson, but then decided against it and now auto-generate all the code

surreal pebble
#

yeah

surreal pebble
#

Listen: clear as much as you can, like I said, those first 4 fields can JUST BE REMOVED

minor lark
#

don't get mad everyone but

surreal pebble
#

Those can just be calculated on the fly

surreal pebble
#

That's so many string instances gone

minor lark
#

what if you just serialized an Emoji[]

#

and then loaded it in memory

clear skiff
#

So, I think I'm going to test using Java 9+, replace the list constructions with List.of and test removing these strings to see if there's any improvement

surreal pebble
#
    // All four values are derived on demand from the emoji string instead of being stored.
    // Hex digits are upper case so the output matches the previously hardcoded values.
    public String getUnicodeText() {
        // One escape per UTF-16 code unit
        return emoji.chars()
            .mapToObj(ch -> String.format("\\u%04X", ch))
            .collect(Collectors.joining());
    }

    public String getHtmlDecimalCode() {
        return emoji.codePoints()
            .mapToObj(cp -> "&#" + cp + ";")
            .collect(Collectors.joining());
    }

    public String getHtmlHexadecimalCode() {
        return emoji.codePoints()
            .mapToObj(cp -> String.format("&#x%X;", cp))
            .collect(Collectors.joining());
    }

    public String getURLEncoded() {
        // Percent-encode every UTF-8 byte
        byte[] bytes = emoji.getBytes(StandardCharsets.UTF_8);
        StringBuilder builder = new StringBuilder(bytes.length * 3);
        for (byte b : bytes) {
            builder.append(String.format("%%%02X", Byte.toUnsignedInt(b)));
        }
        return builder.toString();
    }
#

Here are some quick and dirty implementations

clear skiff
#

And then i load each emoji individually? @minor lark

surreal pebble
#

Need some optimizations for speed

#

But should be correct

#

I'd also lazy initialize "allAliases"

#

A lot of work for nothing really
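
A possible shape for the lazily initialized `allAliases`, sketched on a hypothetical trimmed-down class (`AliasEmoji` and its constructor are illustrative, not the library's code). The `HashSet` is only built when a caller first asks for it.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, trimmed-down Emoji that builds allAliases on first use.
final class AliasEmoji {
    private final List<String> discordAliases;
    private final List<String> githubAliases;
    private final List<String> slackAliases;
    private volatile Set<String> allAliases; // stays null until first requested

    AliasEmoji(List<String> discord, List<String> github, List<String> slack) {
        this.discordAliases = discord;
        this.githubAliases = github;
        this.slackAliases = slack;
    }

    Set<String> getAllAliases() {
        Set<String> result = allAliases;
        if (result == null) {
            // Benign race: two threads may both build the set, with equal contents.
            Set<String> aliases = new HashSet<>();
            aliases.addAll(discordAliases);
            aliases.addAll(githubAliases);
            aliases.addAll(slackAliases);
            result = Collections.unmodifiableSet(aliases);
            allAliases = result;
        }
        return result;
    }
}
```

This only pays off if most emojis never have `getAllAliases()` called on them; code that walks every emoji at startup would still build every set.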

surreal pebble
#

ah well

#

yeah, don't store anything that is trivial to calculate

#

that's all memory usage and extra gc objects for absolutely no reason

minor lark
#

instead of the class initializers

#

i could see that being either faster or slower

#

but its similar in spirit to the json object pointer suggestion

surreal pebble
#

I'd go for a quick custom format at this point

#

OIS is slow and bulky

#

Just a few reads with DataInput and DataOutput should be good

#

give every emoji just a key, yeet them all in a giant map
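
One way such a custom format could look, as a sketch only: the record layout (a count, then a key plus a list of string fields per emoji) and the class name `EmojiCodec` are assumptions made for illustration. It uses just `DataOutputStream` and `DataInputStream` from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical format: record count, then per record a key and its string fields.
final class EmojiCodec {
    static byte[] write(Map<String, List<String>> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(records.size());
            for (Map.Entry<String, List<String>> record : records.entrySet()) {
                out.writeUTF(record.getKey());
                out.writeShort(record.getValue().size());
                for (String field : record.getValue()) {
                    out.writeUTF(field);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Map<String, List<String>> read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            Map<String, List<String>> records = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                int fieldCount = in.readUnsignedShort();
                List<String> fields = new ArrayList<>(fieldCount);
                for (int j = 0; j < fieldCount; j++) {
                    fields.add(in.readUTF());
                }
                records.put(key, fields);
            }
            return records;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real build the bytes would be generated once and shipped as a resource; the `write` side is shown here only so the round trip is visible.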

surreal pebble
#

Yeah this sounds like a fun project, I might try my hand at it at some point and learn how to write a maven codegen plugin

minor lark
#

maybe the ideal solution is to make your "would load the whole world of emoji classes" logic instead have hard coded ranges for characters
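
A sketch of what a range pre-check might look like, assuming a hypothetical `EmojiRanges` helper. The ranges below are a handful of Unicode blocks chosen for illustration and are not a complete emoji table; the point is only that a text without any candidate code point never needs the emoji classes loaded.

```java
import java.util.Arrays;

// Hypothetical pre-filter over sorted, non-overlapping code point ranges.
final class EmojiRanges {
    // Illustrative Unicode blocks only, not a complete emoji table.
    private static final int[] STARTS = {0x2190, 0x2600, 0x2B00, 0x1F300, 0x1F600};
    private static final int[] ENDS   = {0x21FF, 0x26FF, 0x2BFF, 0x1F5FF, 0x1F64F};

    static boolean mayBeEmoji(int codePoint) {
        int i = Arrays.binarySearch(STARTS, codePoint);
        if (i >= 0) {
            return true; // exactly on a range start
        }
        int before = -i - 2; // index of the last range starting below codePoint
        return before >= 0 && codePoint <= ENDS[before];
    }

    // True if any code point of the text falls into one of the ranges.
    static boolean containsCandidate(String text) {
        return text.codePoints().anyMatch(EmojiRanges::mayBeEmoji);
    }
}
```

A caller could run `containsCandidate` first and only touch the generated interfaces when it returns true.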

surreal pebble
#

Which reminds me that I should convert my json library into a single file, so it can be included easily in older java projects, so people can use json for storage all the time

minor lark
#

yeah gradle never finished initializing

surreal pebble
#

#justgradlethings

surreal pebble
#

yeah, that will definitely do it, initializing all of it at the same time, add some object reading etc

minor lark
#

or actually nvm

#

just understood

#

replacing emojis with slack equivalents or whatever

clear skiff
#

You can do all sorts of things. Detecting aliases, unicode, htmlDec, URL-encoded emojis etc

rain cradle
#

having them in one single file will be cancer, but you could also split into multiple enums here like you did with interfaces

clear skiff
#

The first thing I will hit is the file size limit

minor lark
#

there really isn't a reason to use an enum, i'm confused why that keeps being suggested

#

the unicode standard keeps growing

rain cradle
#

im not suggesting it

clear skiff
#

Using enums is the "proper" way, that's why I wanted to use them first. But there are issues with extending them, and using multiple enums for basically the same thing wasn't working that well

rain cradle
#

enums can implement interfaces and you could do it that way, one generic Emoji interface and different enums (representing the different groups) that implement that generic interface

rain cradle
#

though I would still not use enums in this case
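
For reference, the interface-plus-enums layout described above could look roughly like this. All names (`EmojiLike`, `ArrowEmoji`, `SportEmoji`) are made up for the sketch and the field set is cut down to two strings.

```java
// Hypothetical shape: one shared interface, one enum per emoji group.
interface EmojiLike {
    String getEmoji();
    String getDescription();
}

enum ArrowEmoji implements EmojiLike {
    UP_ARROW("\u2B06\uFE0F", "up arrow"),
    DOWN_ARROW("\u2B07\uFE0F", "down arrow");

    private final String emoji;
    private final String description;

    ArrowEmoji(String emoji, String description) {
        this.emoji = emoji;
        this.description = description;
    }

    @Override public String getEmoji() { return emoji; }
    @Override public String getDescription() { return description; }
}

enum SportEmoji implements EmojiLike {
    SOCCER_BALL("\u26BD", "soccer ball");

    private final String emoji;
    private final String description;

    SportEmoji(String emoji, String description) {
        this.emoji = emoji;
        this.description = description;
    }

    @Override public String getEmoji() { return emoji; }
    @Override public String getDescription() { return description; }
}
```

Code that handles emojis generically would take `EmojiLike`; each enum still runs its own static initializer, so this changes the API shape more than the startup cost.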

surreal pebble
#

There's 3 boolean fields, if you have a fixed incrementing integer for your emojis, all of those could just be a few bitsets
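
A small sketch of the bitset idea, assuming every emoji has a fixed integer id (the `EmojiFlags` class and its method names are hypothetical). Each boolean property becomes one `java.util.BitSet` indexed by that id.

```java
import java.util.BitSet;

// Hypothetical flag store: one BitSet per boolean property, indexed by emoji id.
final class EmojiFlags {
    private final BitSet hasFitzpatrick = new BitSet();
    private final BitSet hasHairStyle = new BitSet();
    private final BitSet hasVariationSelectors = new BitSet();

    void set(int id, boolean fitzpatrick, boolean hairStyle, boolean variationSelectors) {
        hasFitzpatrick.set(id, fitzpatrick);
        hasHairStyle.set(id, hairStyle);
        hasVariationSelectors.set(id, variationSelectors);
    }

    boolean hasFitzpatrick(int id) { return hasFitzpatrick.get(id); }
    boolean hasHairStyle(int id) { return hasHairStyle.get(id); }
    boolean hasVariationSelectors(int id) { return hasVariationSelectors.get(id); }

    // Compact form of one flag set, e.g. to ship as a resource and reload with BitSet.valueOf.
    byte[] fitzpatrickBytes() { return hasFitzpatrick.toByteArray(); }
}
```

For 5000 emojis each flag set fits in well under a kilobyte, though the saving is in memory and object count rather than necessarily in measurable startup time.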

clear skiff
#

but wouldn't this just save some nanoseconds?

rain cradle
#

uh thats something I can't answer :)

surreal pebble
#

plus reading one big bitset is a lot faster than reading and initializing a lot of separate class files

#

There's so many things you could do if emojis were just an opaque identifier

#

just have all data in a few huge lists/arrays

#

which you could just read as standard text files

#
import java.util.*;

class Emoji {
    private final EmojiManager manager;
    final int id;

    Emoji(EmojiManager manager, int id) {
        this.manager = manager;
        this.id = id;
    }

    public String getEmoji() {
        return manager.getEmoji(this);
    }

    public List<String> getDiscordAliases() {
        return manager.getDiscordAliases(this);
    }
}

enum EmojiManager {
    INSTANCE;

    private final Map<String, Integer> emojiIndex = new HashMap<>();

    // Imagine all of these are initialized
    private final List<String> emojis = new ArrayList<>();
    private final List<List<String>> discordAliases = new ArrayList<>();

    public Emoji create(String emoji) {
        int index = emojiIndex.size();
        emojiIndex.put(emoji, index);
        return new Emoji(this, index);
    }

    public String getEmoji(Emoji emoji) {
        return emojis.get(emoji.id);
    }

    public List<String> getDiscordAliases(Emoji emoji) {
        return discordAliases.get(emoji.id);
    }
}

interface Emojis {
    static Emoji BALL = EmojiManager.INSTANCE.create("<emoji>");
}
#

Filthy example

surreal pebble
#

But this would have all data concentrated in one spot

clear skiff
#

when talking about small embedded system this might be a lot

surreal pebble
#

A List<String> or List<List<String>> can be made really compact with a byte[] and an int[]
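
To illustrate the `byte[]` plus `int[]` layout, here is a sketch of a packed string table. `PackedStrings` is a hypothetical name, and building it from a `List<String>` is only for demonstration; in practice the two arrays would presumably be generated ahead of time.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical compact string list: all UTF-8 bytes in one array plus end offsets.
final class PackedStrings {
    private final byte[] data;
    private final int[] ends; // ends[i] = offset just past string i

    PackedStrings(List<String> strings) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ends = new int[strings.size()];
        for (int i = 0; i < ends.length; i++) {
            byte[] utf8 = strings.get(i).getBytes(StandardCharsets.UTF_8);
            out.write(utf8, 0, utf8.length);
            ends[i] = out.size();
        }
        data = out.toByteArray();
    }

    int size() {
        return ends.length;
    }

    // Decodes string i on demand; no String objects exist until someone asks.
    String get(int i) {
        int start = i == 0 ? 0 : ends[i - 1];
        return new String(data, start, ends[i] - start, StandardCharsets.UTF_8);
    }
}
```

A `List<List<String>>` can reuse the same table with a second `int[]` that marks where each inner list ends. The trade-off is a fresh `String` per `get` call.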

clear skiff
#

in your example, how does it get all the other data?

clear skiff
#

Do you have the project locally?

remote cosmos
#

no

clear skiff
#

These classes are generated

remote cosmos
#

i am struggling to understand what is slow

#

there aren't many constants

clear skiff
#

One of the larger ones is EmojiPersonActivity

#

There are 3 files for only this category

clear skiff
#

That's what I did. There are 3 interfaces, EmojiPersonActivityA, EmojiPersonActivityB and EmojiPersonActivityC, because of the JVM's 64k code size limit per method, which a single static initializer would otherwise exceed

remote cosmos
#

EmojiEngine.unicode(EmojiArrow.UP_ARROW)

#

sure it's a bit more annoying to write

#

but

#

loading all the constants should be much faster

#

maybe instant

#

and all the crap that come with it can be managed in some EmojiEngine class

#

where you can freely optimize it

#

without being limited by a static final field

#

and you might be able to put them all into the same class instead of 3 separate classes
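
A very small sketch of that split, with made-up names (`EmojiArrowId` and this `EmojiEngine` are illustrative, not existing library classes). The constants carry no data at all, and the engine loads its table on the first lookup.

```java
// Hypothetical: constants carry no data, so initializing them is cheap.
enum EmojiArrowId { UP_ARROW, DOWN_ARROW }

final class EmojiEngine {
    private static String[] unicodeTable; // filled on first lookup

    private static synchronized String[] table() {
        if (unicodeTable == null) {
            // Stand-in for reading a generated resource file.
            unicodeTable = new String[] {"\u2B06\uFE0F", "\u2B07\uFE0F"};
        }
        return unicodeTable;
    }

    // Looks up the data for a constant by its ordinal.
    static String unicode(EmojiArrowId id) {
        return table()[id.ordinal()];
    }
}
```

How the table is stored and when it is loaded can then change freely inside the engine without touching the public constants.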

clear skiff
#

Hm, but is this really going to fix the issue? Yes, they are probably loaded much faster (at the bytecode level, enums compile to something similar to a final class with constants). But then 5 lookups or so have to be performed for each emoji to initialize all internal maps. So whether the values are passed at object creation or retrieved via a map lookup isn't going to make a difference, I think

#

But there have already been some easier suggestions that I'm going to try first before testing more complex things like yours

remote cosmos
#

so the user will only get what he asked for

#

and not literally everything

clear skiff
#

@surreal pebble FYI: removing the string arguments saved an additional 15ms when initializing the classes (I would never have thought that strings alone take that much time). Unfortunately this doesn't help when they are needed anyway for internal maps and are calculated right after the interfaces are done loading.
The bitset change did basically nothing.
= 75ms for interface initialization, but still the same application performance as before when using the library for the first time: 90ms.

Upgrading to Java 17 gave me an incredible boost of an additional 15ms. So now we are at 50ms for interface initialization with the changes above. Using List.of instead of the other collection calls results in the same time

#

@remote cosmos I will probably get the biggest performance boost now if I try to auto-generate more classes for the other fields like version, qualification, description etc. and populate a map. But it's also questionable whether lazy loading a double or enum values really saves time 🤔
So I'm left with the description and lists for aliases

remote cosmos
#

Not only that, but currently you are pretty much loading everything when the user probably only needs 1% of it
That's why you should lazy load

clear skiff
#

What I'm talking about right now is htmlDec/hex and url encoded values

minor lark
#

because they are talking about features that involve scanning a document for all emoji and doing replacements that are per-emoji

#

I'm just baffled at the reading comprehension sometimes

clear skiff
#

We need to know what emojis exist before processing any text, otherwise how do we know we have an emoji?
Trying to load emojis when checking for codepoints etc. while iterating through a text is slow and inefficient because there are some other data structures to ensure fast processing

proper bloom
#

this was the first thing i saw

#

havent read the whole convo, but it seems like you are doing things eagerly

#

whats your newer code looking like?

clear skiff
#

Still looks the same atm. Only really tried this suggestion

#

But it's not really viable because the computed values which are removed are needed anyway

minor lark
#

array of structs to struct of arrays this

#

like instead of making a bunch of emoji objects that all have these values set - which you can still leave open to the public api if you want - have a bunch of

#
byte[] emoji = ...;
byte[] slackReplacements = ...;
minor lark
#

and then each emoji just is a pointer into the various arrays + a bunch of sizes and offsets

proper bloom
#

one of those "itll be done in 2-8 weeks" type of things

minor lark
#

yes you end up building strings from scratch more often, but the initial load is faster

proper bloom
#

you mentioned it at first, "15ms save". then upgraded to 17, and got another 15ms? exactly 15ms?

#

how are you monitoring these changes?

minor lark
#
UP_ARROW = new Emoji(213, 4, 6, 7, 9); // Magic numbers pointing into big byte arrays
proper bloom
#

feels like youre just tossing numbers

#

what are you using to determine the metrics?

clear skiff
#

Somewhat around 15ms, not exactly. Sometimes it shows +-2 but I guess this also depends on the other applications I am running on my PC like live streams

#

JMH

minor lark
#

do those need them already loaded as strings

#

basically normal use of your code doesn't need things available right away, it sounds like its only these algorithms that do something for all emoji

clear skiff
#

These fields are not used for anything essential and could be lazy loaded. But it's also questionable whether the booleans, double or enums require that much time in comparison to strings, as shown earlier

final List<String> keywords,
final boolean hasFitzpatrick,
final boolean hasHairStyle,
final double version,
final Qualification qualification,
final String description,
final EmojiGroup group,
final EmojiSubGroup subgroup,
final boolean hasVariationSelectors
#

Removing the keywords list might also get rid of a lot of newly created strings

minor lark
#

and there isn't much to do in the way of reducing number of classes

#

string creation and, more importantly, validation - thats probably the issue

#

but i would be curious: is this an old hardware problem or an old software problem

#

your machine is probably too fast to benchmark what goes on in an old android

clear skiff
#

Something like AOT with graalvm would solve the issue. Everything is known at compile time.

minor lark
#

and might have too many new optimizations like string cache

minor lark
#

like

#

strings are pretty heavily optimized and known at compile time usually

#

now i don't know how the compile time string cache interacts with validation logic

#

but i do know that there is such a cache

#

so "a" == "a" will always be true

#

just reasons to think that to get to the bottom of this you should be testing on the hardware they are having a problem on
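
The `"a" == "a"` remark refers to string literals being interned: identical literals resolve to the same `String` instance. A tiny demonstration, with `InternDemo` as a throwaway name:

```java
// Demonstrates that identical string literals share one interned instance.
final class InternDemo {
    static boolean literalsShareInstance() {
        String a = "a";
        String b = "a";
        return a == b; // both refer to the same interned literal
    }

    static boolean runtimeStringIsDistinct() {
        String built = new String("a"); // forces a fresh object
        // Different object, but intern() maps it back to the literal's instance.
        return built != "a" && built.intern() == "a";
    }
}
```

This says nothing about how an older Android runtime resolves 5000 sets of literals at class initialization, which is why measuring on the affected devices is the safer route.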

minor lark
#

since there is also the "what the fuck is the dalvik vm doing" problem

minor lark
#

but i do know that stuff changed

clear skiff
#

Impressive that they manage to get even better results with the AS emulator. So there seem to be quite some improvements in the JRE, or ART as they call it on Android

#

ART introduces ahead-of-time (AOT) compilation, which can improve app performance. ART also has tighter install-time verification than Dalvik.
Probably that's why

proper bloom
#

AOT is great for apps that stay alive for a few days

#

when it comes to months of uptime, JIT takes the cake

#

and thats why Java is great for servers

clear skiff
#

I never said otherwise? I just clarified why initialization on AS probably takes even less time than on my PC.