W3 Strings, their IDs and Usage in Radish Projects
This article gives you a thorough overview of how the radish modding tools handle adding new strings to a The Witcher 3 mod. If you are unsure about the process, about IDs, their idspace and idstart settings, or how to define custom strings for more diverse usage - you're at the right place!
- 1 Overview
- 2 Strings in RedEngine3
- 3 Strings in Radish Modding Tools
The radish modding tools can encode custom strings by means of the w3strings encoder, which is included in the main tools package. Like all other encoders, it doesn't need to be used directly, but is largely automated by the project template, its setup of batch scripts and the combination with the other encoders. For example, you don't need to extract every string specified in your scenes by hand. On the other hand, for everything to work smoothly, some settings, namely idspace and idstart, need to be configured correctly, and custom strings aside from the automatically extracted ones have their own workflow.
In this article, we'll cover several topics - but feel free to skip to the parts important to you. Every section is written in such a way that you should be able to gain information on the sub-topic without needing to re-read the whole article. If some part relies on another, it will be mentioned.
Strings in RedEngine3
We start with some background info on how strings are stored in the engine and what our options as modders are.
On .w3strings files and IDs
All strings used in the game and any additional mods/DLCs are stored in a database consisting of (several) .w3strings files. Every file consists of entries, where an entry has the following elements:
unique ID as key | optional hex key (-> based on a string key) | string as value
The range of possible IDs is 0.000.000.000 ... 2^32 - 1 (= 4.294.967.295). Wherever you need a string as a user, you will specify its ID or string key (more on that and hex keys in the next section), and the game will automatically look up the string value in the database.
Adding new strings works by creating a .w3strings file in your mod: If such a file contains entries, they are added to the joint database in the game. You could also overwrite an existing string by reusing the ID/hex key of that string and assigning your replacement. Since .w3strings files are encoded in a special way, you need the w3strings encoder to create them.
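As a rough mental model (not the engine's actual implementation), the joint database can be pictured as a dictionary keyed by ID, where a file that reuses an existing ID replaces that string. The function name load_entries, the sample IDs and texts, and the assumption that the later-loaded file wins are all made up for illustration:

```python
# Conceptual model only: the real engine reads encoded .w3strings files.
MAX_ID = 2**32 - 1  # upper bound of the ID range described above


def load_entries(database: dict, entries: dict) -> None:
    """Add the entries of one (hypothetical) .w3strings file to the joint database."""
    for string_id, text in entries.items():
        if not 0 <= string_id <= MAX_ID:
            raise ValueError(f"ID out of range: {string_id}")
        # Assumption: an entry loaded later replaces an existing one with the same ID.
        database[string_id] = text


database = {}
load_entries(database, {1000: "original text"})  # made-up existing entry
load_entries(database, {
    1000: "replaced text",          # the mod reuses an existing ID
    2111410000: "brand new text",   # the mod adds a new string
})
```

After both calls the database holds two entries: the replaced string under ID 1000 and the new string under ID 2111410000.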
An alternative to IDs: string/hex keys
Referencing a string by ID, which is very straightforward, has a downside: It is hard to know which string is used when looking at a referencing ID and it's easy to make a mistake by typing a wrong digit as a user. This is why a second mechanism exists, which allows referencing strings by string keys. Recall what an entry contains:
unique ID as key | optional hex key (-> based on a string key) | string as value
Whenever you use the optional string key, it needs to be transformed to a corresponding representation, the hex key, which is then stored in the entry of a .w3strings file. The engine then automatically handles the rest. Using a string key in a .reddlc file (as an example) looks like this:
Note that what follows here are technical details about the implementation of string keys as well as the meaning of hex keys. You do not need this knowledge (you can, in fact, ignore the hex key since everything surrounding it is automated), but if you're interested - feel free to read on :)
What is the purpose of the hex key? One problem of using strings as a key to look up elements is that they are (relatively) slow to process. This is especially problematic if they are potentially used a lot. Therefore an optimization is used.
Upon encoding into .w3strings, any string key is hashed into a hex key, which is stored in the entry instead of the original string key. Hex keys are far more efficient for lookup algorithms: Thus when we use a string key like "boss" in the picture above, the engine won't search for "boss" in the database, but hash "boss" to its hex key and then search for that.
There is one major caveat to this procedure: The hashing procedure can produce collisions. This means it is possible that two different string keys are hashed to the same hex key. Unfortunately we cannot catch this error, so the only advice that remains is to use string keys only when necessary.
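To make the collision problem more tangible, here is a small sketch. It does NOT use the hash function of the engine or the encoder (that function is not described in this article). Instead it uses a CRC32 that is deliberately truncated to 8 bits as a stand-in, so that collisions are easy to find. The names toy_hex_key and find_collision are made up for this example:

```python
import zlib
from typing import Optional


def toy_hex_key(string_key: str, bits: int = 8) -> str:
    """Stand-in hash (truncated CRC32); NOT the hash used by the engine or encoder."""
    value = zlib.crc32(string_key.encode("utf-8")) & ((1 << bits) - 1)
    width = bits // 4  # number of hex digits needed
    return f"{value:0{width}x}"


def find_collision(keys: list) -> Optional[tuple]:
    """Return the first pair of different string keys that share a hex key."""
    seen = {}
    for key in keys:
        hex_key = toy_hex_key(key)
        if hex_key in seen and seen[hex_key] != key:
            return seen[hex_key], key
        seen[hex_key] = key
    return None


# 300 distinct keys but only 256 possible 8-bit values: a collision is guaranteed.
candidates = [f"my_mod_key_{i}" for i in range(300)]
collision = find_collision(candidates)
```

A real hash has far more possible values, so collisions are much rarer, but the principle is the same: two different string keys can end up with one hex key, and a lookup cannot tell them apart.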
There is one last aspect to the strings concept of the engine and that is defining separate strings for several languages. The idea here is simple: Each .w3strings file has a given language associated with it. The language is specified by using abbreviations like "en", "de" as a prefix for the .w3strings filename:
A .w3strings file can thus be localized by creating several language versions of it (with the respective prefixes), where each version contains the same keys but the correspondingly translated string values. The engine will then automatically choose the right version of the strings based on the user's language setting.
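Since every language version is supposed to contain the same keys, a quick consistency check can help before encoding. The following is only a sketch that works on already loaded data (language prefix -> ID -> text). The function name missing_translations and the sample entries are hypothetical:

```python
def missing_translations(versions: dict) -> dict:
    """For each language prefix, collect the IDs defined in other versions but missing here."""
    all_ids = set()
    for entries in versions.values():
        all_ids.update(entries)  # adds the IDs (dict keys) of this language version
    return {lang: all_ids - set(entries) for lang, entries in versions.items()}


versions = {
    "en": {2111410000: "Hello", 2111410001: "Goodbye"},
    "de": {2111410000: "Hallo"},  # second entry not translated yet
}
report = missing_translations(versions)
```

Here the report shows that the "de" version lacks ID 2111410001, while the "en" version is complete.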
Strings in Radish Modding Tools
The previous three sections discussed how strings are handled in RedEngine3 using .w3strings files as a database. Now we will proceed to how the radish modding tools are built and geared towards that concept to allow defining and encoding custom strings in a user-friendly way (relatively, that is).
Input format for encoding with w3strings.exe
First, we will look at how w3strings.exe, an executable included in the modding tools (there is some info in .../radish-tools/docs.strings as well), encodes a human readable list of strings into a .w3strings file. The encoder (= w3strings.exe) accepts CSVs (UTF-8 without BOM) formatted in a certain way as input. A typical input CSV might look like this:
The first line defines the language prefix, which determines language specific encoding and is used as filename for the resulting .w3strings. The second line is a comment, which shows what each column means: column 1 is the ID, column 2 is the hex key, column 3 is the string key and column 4 is the text, which is the actual string we want to add/replace.
Important: DON'T open string CSVs with Excel, since it changes the formatting.
This format is a mirror of a .w3strings file. The first line corresponds to the filename, and from line 3 on every line corresponds to an entry of the encoded file (see 2.1). Therefore, the ID needs to have 10 digits. Under text you can write UTF-8 characters and things like line breaks (</br>) and for string keys anything in " a-z0-9_" is allowed. However, for IDs there are some additional constraints imposed by w3strings.exe.
The w3strings.exe encoder intentionally restricts the IDs usable for an encoding of one CSV (= the ID space) in order to "prevent multiple mods to overwrite strings of other mods". To ensure this, two conditions are required:
- any ID used as input for the encoder needs to start with 211
- IDs used in one input CSV need to specify an idspace = nnnn, and every ID inside must follow the format 211nnnnxxx, with xxx being free to use.
Condition 1. ensures that modders using this encoder can avoid conflicts with strings present in the vanilla game, since 211 was found to be an unused prefix.
Condition 2. ensures that modders using this encoder can avoid conflicts among themselves, if they use different idspaces. A way to do this proposed by the creator is using the mod's Nexus ID as idspace.
As an example: The CSV file shown in section 3.1 only contains IDs prefixed with 211 and with idspace = 1410. Thus this mod is guaranteed to be conflict-free with any string defined in vanilla tw3 + dlcs and with any encoder-using authors who chose a different idspace.
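The two conditions can be expressed as a small check. This is only a sketch of the rule as described above, not the validation code of w3strings.exe. The function name check_id is made up, and it is assumed that an idspace with fewer than four digits is padded with leading zeros:

```python
def check_id(string_id: str, idspace: int) -> bool:
    """Return True if string_id has the form 211nnnnxxx with nnnn == idspace."""
    if not 0 <= idspace <= 9999:
        raise ValueError("idspace must fit into four digits")
    # Assumption: a shorter idspace is zero-padded to four digits.
    expected_prefix = f"211{idspace:04d}"
    return (
        len(string_id) == 10
        and string_id.isascii()
        and string_id.isdigit()
        and string_id.startswith(expected_prefix)
    )


ok = check_id("2111410000", 1410)               # correct prefix and idspace
wrong_idspace = check_id("2111411000", 1410)    # idspace is 1411, not 1410
wrong_prefix = check_id("1001410000", 1410)     # does not start with 211
```

Only the first ID passes. The other two would either collide with another mod's idspace or leave the 211 range altogether.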
Automated string extraction & encoding with the Project Template | idstart
The radish modding tools come with a tailored project template, a folder of specialized folders containing batch scripts for automated encoder calls, resource collection and mod/dlc deployment. Since the project template (naturally) uses w3strings.exe (though hidden by a chain of batch script calls), we need to define the idspace for the input CSV that will be generated for the project. See the previous section for information on idspaces.
First of all, you need to specify the idspace in "./_settings_.bat" - this value is used as validation input for w3strings.exe - it checks that every ID used in the project has indeed the same idspace. Now on to the places, where strings are used.
In a radish project, there is a defined subset of places (quest production settings, journals, scenes), where custom strings are defined by the modder. Upon running e.g. full.rebuild.bat, the tools will automatically search those places for strings, assign IDs to them and generate string CSV files in the 'strings' dir where the assignments are listed:
The generated string CSVs are merged into one all.en.strings.csv, which is the input for w3strings.exe.
Since the automatic generation of IDs is done separately for each place, we need to specify the idspace chosen for the project in each place again. Having different parts raises a question: We know from section 3.2 that via prefix and idspace we can assume to have a unique range of IDs to use for our project. But how can we ensure that the IDs generated in the different places are distinct from each other? This is the motivation for the idstart setting, which must be specified for each place. Recall the generic radish string ID format from section 3.2: 211nnnnxxx.
Here nnnn = idspace, which must be the same for all places and _settings_.bat. However, if xxx starts at 0 for each place, then we'll have a conflict. Thus, the idstart setting, which is an offset added to xxx, has to be manually adjusted to a different value for each place. If we for instance specify idstart = 100 for quest production and idstart = 200 for our first scene, then those won't conflict if there aren't more than 99 strings relying on the quest production settings.
Thus you need to set the same idspace and different idstarts in these places:
- the quest production settings (./definition.quest/prod.quest-newquestproject.yml), which provide IDs for e.g. quest name and caption, journal strings
- each scene (./definition.scenes/scenes.*.yml), where IDs are needed for the lines spoken by actors
In a scene dumped from the storyboard UI mod, you will find the idspace/idstart settings under the "production" section.
Note that usually idstarts in intervals of 100 are a good default. However, if you have more than nine places using strings to cover (which is mostly caused by scenes), you might need to adjust the idstarts in a more customized way.
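The interplay of idspace and idstart can be sketched in a few lines. This illustrates the scheme described above and is not how the radish tools generate IDs internally. The names make_id and find_overlaps, the place names and the string counts are made up, and it is assumed that the counter within a place starts at 0:

```python
from itertools import combinations


def make_id(idspace: int, idstart: int, index: int) -> int:
    """Compose 211nnnnxxx from idspace (nnnn) and idstart + index (xxx)."""
    offset = idstart + index  # assumption: index counts from 0 within one place
    if not 0 <= idspace <= 9999 or not 0 <= offset <= 999:
        raise ValueError("idspace or offset does not fit the 211nnnnxxx format")
    return int(f"211{idspace:04d}{offset:03d}")


def find_overlaps(places: dict) -> list:
    """places maps a name to (idstart, number of strings); return all overlapping pairs."""
    overlaps = []
    for (name_a, (start_a, count_a)), (name_b, (start_b, count_b)) in combinations(
        places.items(), 2
    ):
        if start_a < start_b + count_b and start_b < start_a + count_a:
            overlaps.append((name_a, name_b))
    return overlaps


first_quest_id = make_id(1410, 100, 0)
places = {
    "quest production": (100, 40),  # uses xxx = 100..139
    "scene 1": (200, 120),          # uses xxx = 200..319, too many for a 100-interval
    "scene 2": (300, 10),           # uses xxx = 300..309
}
conflicts = find_overlaps(places)
```

In this made-up project, "scene 1" has more than 100 strings and therefore runs into the range of "scene 2". Moving the idstart of "scene 2" to 400 would resolve the conflict.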
Custom strings in the 'strings' directory
In the ./strings folder of radish project templates string CSV files are stored to be used as input when the time for encoding has come (see section 3.1). As discussed in the previous section, string CSVs are automatically generated in this dir for each place (e.g. scenes). But what if you want to define strings which aren't covered by the automatic generation?
In this case, you can simply add another strings CSV file to the ./strings directory. Make sure that each entry you specify has an ID that is unique in your project - and also make sure that you don't choose a filename that would be written by the automatic generation (e.g. avoid 'all.en.strings.csv').
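Checking the uniqueness of IDs across several string CSVs can be sketched like this. The example works on IDs that have already been read from the files, so it makes no assumption about the exact CSV syntax. The function name find_duplicate_ids and the filenames are hypothetical:

```python
from collections import defaultdict


def find_duplicate_ids(ids_by_file: dict) -> dict:
    """Map every ID that occurs more than once to the list of files it occurs in."""
    locations = defaultdict(list)
    for filename, ids in ids_by_file.items():
        for string_id in ids:
            locations[string_id].append(filename)
    return {
        string_id: files for string_id, files in locations.items() if len(files) > 1
    }


ids_by_file = {
    "generated_scene.csv": [2111410200, 2111410201],  # hypothetical generated file
    "my_custom.csv": [2111410201, 2111410900],        # hypothetical custom file
}
duplicates = find_duplicate_ids(ids_by_file)
```

Here ID 2111410201 is reported for both files, so the custom entry should get a different ID (for example from an idstart range that no place uses).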