Ticket #48675

specenum generator

Open Date: 2023-09-19 21:02 Last Update: 2023-10-05 22:03

Reporter:
Owner:
(None)
Type:
Status:
Open
Component:
MileStone:
(None)
Priority:
5 - Medium
Severity:
5 - Medium
Resolution:
None
File:
None

Details

Adjusting specenum values can take a lot of effort, and that need for effort is often multiplied by the number of branches to backport the patch as the big change is unlikely to apply cleanly. The main problem is that we have to number the entries in the sources. So adding or removing entries necessitate changing all the entries that follow.

We could have just spec files describing the enum:

...
VAL_FOO, "foo name"
VAL_BAR, "bar name"
...
and a python source generator script producing a C header with the specenum definitions like they now are maintained by hand.

Subtickets:

  • base implementation
    • #48702 initial implementation (done)
    • #48786 prefix (done)
    • #48789 implicit count and zero identifiers (done)
    • #48790 autogenerate user flags (done)
  • fully new features
  • build system integration
  • migration of existing enums
    • TBD

Ticket History (3/12 Histories)

2023-09-19 21:02 Updated by: cazfi
  • New Ticket "specenum generator" created
2023-09-21 04:26 Updated by: alienvalkyrie
Comment

Couple logistical considerations here. First is the question of what goes into the definition files; just the identifiers and names, or the surrounding info (all the other SPECENUM_* defines) as well. If the former, we'll have to output a separate file for each enum that can't be included on its own — lot of clutter and room for confusion and mistakes. If the latter, we might have to think about extra defines related to the enum that aren't part of the specenum machinery itself (e.g. TOPO_FLAG_BITS, IF_LAST_USER_FLAG); spreading connected information across multiple files (the enum definition and the manually written header that includes the result) isn't great. I'm thinking the full enum definition is likely the better choice.

Second is the question of how many definition and output files. One per enum (on either side) sounds like way too much clutter. One each per header file could work; or we could have a few big definition files (e.g. common, server, client) that each get processed into multiple output headers (this would require the definition to specify which output file a given enum goes into). I'm actually not sure which option I'd prefer here; separate defs for each header means when you do encounter an #include "effects_enums_gen.h", you know to look at effects_enums.def, rather than having to open common_enums.def and ctrl+F for "effects". But it also means there's way more separate definition files. This choice will likely also partially depend on the...

Third question, build system integration, which is far from my area of expertise. We don't want to glob source files for all the issues that can cause, so even if we've got a nice naming scheme (e.g. *_enums.def into *_enums_gen.h), we'll have to maintain lists of all the definition files and all the output (i.e. intermediary source) files in the build. Actually, looking at meson.build, it seems to figure out the headers on its own, not sure about that. There's also the not-yet-EOL autotools build, the internals of which I know approximately zero things about. If we don't actually have to list all the intermediary headers, the "three big defs" approach might be simpler. If we do have to list them all, having a bunch of separate custom targets (one for each def-output pair) could be more reasonable.

Final question, internationalization. For some enums, all or most of the names are internationalized. Simplest solution here would probably be to keep the N_() markers directly in the *_enums.def files and add those to the potfiles; would have to make sure TRANS comments work with that as well. This would also be an argument in favor of having few large defs rather than many small ones.


The script itself should be quite doable. I'd likely copy some things (like comment parsing and general utility functions) from generate_packets.py; might be worth considering if in the future, we'd want to extract those into a separate common utility library (though getting Python to reliably find that when importing might require putting all the scripts into one directory).

For existing things that aren't straightforward (like AI_LEVEL_EXPERIMENTAL only existing in debug builds), we can just leave those as-is for now and add the required functionality later. Then long-term, once we've migrated all specenum uses to this, we can fold specenum_gen.h and generate_specenum.py into it as well.

2023-09-21 11:52 Updated by: cazfi
Comment

These might live under gen_headers, or in their own subdirectory within it. It's the directory autotools build process first, so the results are available for the rest of the build.

With a separate directory, I don't see large number of files big problem. Sure, we need to list them all in the initial implementation, but adding new file later should be rare and easy enough. Separate header per enum would also allow us simply include the exact header where we currently have the definition, and thus there would be no new interdependency headaches. The best solution, however, would be to allow .def to specify what ever amount of enums it wants, and matching *_gen.h gets generated by those same enums. Complicates the naming convention, though (can't have "the single" defined enum in file name)

meson.build currently lists only source files to compile. For *compilation* to work, it does not need to know regular headers. It would need to know list of these headers to generate.

Initial step (first subticket) should be possible without build system integration - expect developer to run the script manually, and keep the resulting header under version control. Shouldn't do that for a large number of generated headers, but just for the first.

2023-09-21 23:03 Updated by: alienvalkyrie
Comment

Reply To cazfi

These might live under gen_headers, or in their own subdirectory within it. It's the directory autotools build process first, so the results are available for the rest of the build.

Good shout. Moving old generate_specenum.py there as well might be sensible too, but not worth doing now if we're planning on eventually folding it into this new script. The individual def files should probably stay with the header that's going to be including them tho.

meson.build currently lists only source files to compile. For *compilation* to work, it does not need to know regular headers. It would need to know list of these headers to generate.

That makes me think it would be best if the build system executes the new script once for each generated file, which also means there should be a one-to-one correspondence between input and output files (so not a "three big definition files" situation). One big input file making multiple output files would mean the definition already has to specify what goes where, so passing the output paths to the script would be redundant, at best a control mechanism; a checklist of "the definition should specify something for each of these files, and should specify no other output files".
The other alternative would be calling it with multiple input files to aggregate into one output file, which could in principle also be reasonable, but in practice here would be the opposite of useful.

I don't think separate definition files for each enum make much sense. It'd mean repeating the name in multiple places (def file name, inside the def, possibly output file name); if we want to rename an enum, we'd have to rename the file, including in version control and build systems; same if we want to add or remove one. Just adds a lot of hassle, especially for potential new contributors who understand code, but are unfamiliar with the stuff around it (e.g. myself ten years ago). By extension, this means I'd go with one def per current header (potentially containing multiple enum definitions), which gets processed into one generated header file (containing all those enum definitions). We will lose the ability to define different enums in different places within the same header.

For naming, I'd generally go foo_enums.def into foo_enums_gen.h for the enums currently in foo.h. Not sure about the couple enums that are defined in .c and .cpp files, but that's a future concern.

Initial step (first subticket) should be possible without build system integration - expect developer to run the script manually, and keep the resulting header under version control. Shouldn't do that for a large number of generated headers, but just for the first.

Sounds reasonable. I take it you'd have this be the metaticket, rather than the ticket for the first step?

Also I consider #44767 (bumping Python version) a blocker for this 'cause of useful new Python features (variable annotations and f-strings).

2023-09-21 23:41 Updated by: cazfi
Comment

Python & autotools considerations: I think we want the generated headers to be part of the tarball, just like output of generate_packets.py is, to avoid making python (with minimum version) requirement for autotools *build* from a tarball. This makes *development* without python unfeasible, but I'm ready to let that, already partial, support to die.

For autotools build system integration, .def files should ideally live in equivalent srcdir to output blddir. (We recently fixed some stale .metainfo.xml issues by also creating the output to bootstrap/ )

(Edited, 2023-09-21 23:42 Updated by: cazfi)
2023-09-23 06:03 Updated by: alienvalkyrie
  • Details Updated
Comment

Made subtickets for the first things #48702 #48703

2023-09-23 06:04 Updated by: alienvalkyrie
  • Details Updated
2023-09-25 20:30 Updated by: alienvalkyrie
Comment

Went through all existing uses of specenum_gen.h to round up nonstandard uses that are potential roadblocks to the full migration. In descending order of severity:

  • There are two cases (unit_role_id in unittype.h and diplrel_other in player.h) where one enum's values follow those of another enum (unit_type_flag_id and diplstate_type, respectively) so they can be used alongside each other
  • AI_LEVEL_EXPERIMENTAL in fc_types.h only exists in debug builds (and it's not the last level, so AI_LEVEL_AWAY has a different index in debug vs non-debug builds)
  • There are enums in both tilespec.h and tilespec.c, as well as both mapimg.h and mapimg.c, complicating the naming scheme
  • Comments around some enums (phase_mode_type and city_acquire_type in fc_types.h) mention raw numeric values used in savegames; we're abstracting those away, so manipulating them for savegame compatibility might become more difficult
  • Some constants tightly tied to certain enums (TOPO_FLAG_BITS and WRAP_FLAG_BITS in fc_types.h) could now no longer be defined right next to the enum. There are more other than these two, but they are mostly defined in terms of the enum constants, so they're not as much of a problem.

I do have ideas on how to deal with all of these, but I'm also hoping some of these can be reworked in other places to become non-issues (or at least easier to work with), rather than having to address each of them in full in the script, in ways that can then interact with each other in undesired ways.

2023-09-25 21:24 Updated by: cazfi
Comment

Savegame values should not be hard coded, regardless of this ticket. For city_acquire_type there might be a ticket already, opened when I added the feature with current temporary savegame handling. In any case it should be resolved before it becomes stable format for the first time.

2023-10-02 05:59 Updated by: cazfi
Comment

Further development idea (bad? good?) after writing my very first .def : #48786

(Edited, 2023-10-02 05:59 Updated by: cazfi)
2023-10-05 18:19 Updated by: alienvalkyrie
  • Details Updated
2023-10-05 22:03 Updated by: alienvalkyrie
  • Details Updated

Attachment File List

No attachments

Edit

Please login to add comment to this ticket » Login