About UDMF again...

Logan MTM
Posts: 179
Joined: Wed Jan 04, 2006 2:52
Location: Rio de Janeiro - BRAZIL

Post by Logan MTM »

This is the map I'm making (5% done) for the "20 Years of Doom" project:
Image

Now, here is the problem: if the TEXTMAP is 1.28 MB at about 5% completion, it will reach roughly 28-30 MB at 100%!

Now the math:

A megawad with 32 maps x 30 MB per map = 960 MB of TEXTMAPs!?

Is that right? :shock:
Last edited by Logan MTM on Mon Jan 25, 2010 14:38, edited 1 time in total.
Gez
Developer
Posts: 1399
Joined: Mon Oct 22, 2007 16:47

Post by Gez »

Yes. The textmap format takes a lot more space than the binary format.

However, the bright side is that it compresses pretty well. If you don't make a megawad but a "megapack" (pk3 or, better yet, pk7), where each map wad is compressed, it shouldn't take that much more space than with Hexen format.
Graf Zahl
GZDoom Developer
Posts: 7148
Joined: Wed Jul 20, 2005 9:48
Location: Germany

Post by Graf Zahl »

Compressed UDMF is about 20-25% larger than compressed binary. There's really no need to release anything with UDMF in WAD format, as it's ZDoom only.
milasudril
Posts: 64
Joined: Fri May 15, 2009 17:21

Post by milasudril »

I agree that text-format Doom maps are a bad idea. A text format is for a human-computer interface where the human wants to use a text-based interface. But who wants to use a text editor to "draw" a Doom map? It is quite obvious that this is done much better with a map editor where you can see what you are doing. In this case a binary format is much better. It is easier to read (in some cases you can just read an entire object from a file without any conversion) and write. And we should not even speak about what a wave file (or probably even worse, a high-resolution picture) would look like if it used XML:

Code:

<frame time="123454">
    <sample channel="left" value="123" />
    <sample channel="right" value="123" />
</frame>
Is there any reason why UDMF is a text-based format?

Post by Graf Zahl »

The reason is extensibility. In a text format you can add an unlimited amount of new properties. A binary format will forever be locked to what it was designed to do at the time it was created.

And now take one guess why definition formats like XML have become so popular that even word processors are using them as the main format to store their data. Yes, both MS Office and OpenOffice have ditched their binary formats in favor of a text file representation, too.

Binary formats are a dead end everywhere where feature sets are evolving and need adjustment in the stored data. (No, make that: Binary formats are a dead end. Period.)

Post by milasudril »

Graf Zahl wrote:The reason is extensibility. In a text format you can add an unlimited amount of new properties. A binary format will forever be locked to what it was designed to do at the time it was created.
You can use a class ID and a size for all classes. These two fields are added at the beginning of every record type.

Code:

struct Foo
{
    unsigned int classID; // identifies the record type
    unsigned int size;    // record size; use unsigned long long int if really large objects are needed
    // fields of Foo
};
The only problem left now is to standardize the class IDs, but that is a problem with any file format.
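
To make the class ID plus size idea concrete, here is a minimal sketch in C of a reader that walks a buffer of such records and uses the size field to step over record types it does not know. The names `RecordHeader`, `VertexRecord`, `CLASS_VERTEX` and `count_vertices` are hypothetical and not taken from any real map format; the sketch also assumes that `size` counts the whole record including its header and that the data is in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header placed at the start of every record. */
typedef struct {
    uint32_t class_id; /* which kind of record follows */
    uint32_t size;     /* assumed: total record size in bytes, header included */
} RecordHeader;

enum { CLASS_VERTEX = 1 }; /* hypothetical class ID known to this reader */

typedef struct {
    RecordHeader hdr;
    int32_t x;
    int32_t y;
} VertexRecord;

/* Walk a buffer of records. Vertex records are counted; records with any
 * other class ID are skipped by advancing past hdr.size bytes.
 * Returns the number of vertex records, or -1 if the data is malformed. */
int count_vertices(const unsigned char *buf, size_t len, int *skipped)
{
    size_t pos = 0;
    int vertices = 0;

    assert(buf != NULL && skipped != NULL);
    *skipped = 0;

    while (pos < len) {
        RecordHeader hdr;

        if (len - pos < sizeof hdr)
            return -1; /* truncated header */
        memcpy(&hdr, buf + pos, sizeof hdr);

        if (hdr.size < sizeof hdr || hdr.size > len - pos)
            return -1; /* size field is inconsistent with the buffer */

        if (hdr.class_id == CLASS_VERTEX && hdr.size >= sizeof(VertexRecord))
            vertices++;
        else
            (*skipped)++; /* unknown record type: ignore its payload */

        pos += hdr.size;
    }
    return vertices;
}
```

The reader never needs to understand the payload of an unknown record, only its length. What the sketch does not address is how class IDs get agreed upon between tools.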

Binary formats are not dead. Many of the commonly used media file formats are binary: JPEG, RIFF WAVE, MP3.

A text format probably works well for describing text, because much of the stored data is text. Therefore it works well for describing formatted documents too. But for a large set of numbers it is a very bad idea. It is important to note that if program A writes a floating-point number in decimal and program B then reads that number, there is a risk of precision loss.
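
The precision-loss risk mentioned above depends on how many digits the writer emits. The following C sketch, assuming IEEE 754 doubles and a correctly rounding C library, writes a value as decimal text and parses it back; `round_trip` is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Format a double as decimal text with the given number of significant
 * digits, then parse that text back into a double. */
double round_trip(double value, int digits)
{
    char text[64];

    assert(digits > 0 && digits <= 40);
    snprintf(text, sizeof text, "%.*g", digits, value);
    return strtod(text, NULL);
}
```

Under those assumptions, `round_trip(v, 17)` returns `v` unchanged for any finite double, while a short form such as `round_trip(1.0 / 3.0, 6)` comes back as a different value. So the loss is a property of the chosen precision rather than of text as such.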

XML stores the name of each field for each record, which is the worst thing about this file format. If I want to use a text format to store a larger data set, I can use CSV instead, which does not repeat this information.

Post by Gez »

milasudril wrote:XML stores the name of each field for each record which is the worst with this file format. If I want to use a text format to store a larger data set I can use csv format, which do not repeat this information, instead.
And then, what if you need to enhance a field with an additional, optional parameter, how does CSV fare?

Repeated text is not a real problem as soon as you bring in compression.

Sure, they could have used the same binary structure as the Hexen format, just with 64-bit words instead of 8-bit or 16-bit words. Maybe with a few additional fields as well, such as lineid. Woo. Limits are gone, cool. How would it help? What if a port introduces specials with 12 parameters, instead of just 5? The format has to be redefined for this port. Or you've got to go through some cumbersome workaround, like Hexen's Line_SetID or Eternity's ExtraData mechanism.

With UDMF, it's not a problem. You just list the additional params, the same format is used. Any UDMF-capable editor supports your added values without having to update its code and rebuild it.
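
As a rough illustration of why extra parameters need no format change, here is a small C sketch of a reader for `key = value;` pairs that fills in the keys it knows and simply passes over the rest. It is a simplification and not a UDMF parser: it handles only integer values inside an already extracted block body, and the `Vertex` struct, the `parse_fields` function and the `extra_arg` key in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    int x;
    int y;
    int unknown_fields; /* how many keys this reader did not recognize */
} Vertex;

/* Parse "key = value;" pairs. Known keys are stored, unknown keys are
 * counted and otherwise ignored. Returns the number of pairs parsed. */
int parse_fields(const char *body, Vertex *out)
{
    char key[32];
    int value;
    int parsed = 0;

    assert(body != NULL && out != NULL);
    out->x = out->y = out->unknown_fields = 0;

    for (;;) {
        int consumed = 0; /* stays 0 unless the whole pair, ';' included, matched */

        if (sscanf(body, " %31[a-z_0-9] = %d ;%n", key, &value, &consumed) != 2
            || consumed == 0)
            break;

        if (strcmp(key, "x") == 0)
            out->x = value;
        else if (strcmp(key, "y") == 0)
            out->y = value;
        else
            out->unknown_fields++; /* newer or port-specific field: skip it */

        body += consumed;
        parsed++;
    }
    return parsed;
}
```

Calling `parse_fields("x = 64; y = -32; extra_arg = 7;", &v)` fills `x` and `y` and counts one unknown field. An older reader keeps working on data written by a newer tool, at the cost of repeating the key names in every record.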

About precision loss: it's not greater than that from fixed point.

Post by Graf Zahl »

Gez wrote: Sure, they could have used the same binary structure as the Hexen format, just with 64-bit words instead of 8-bit or 16-bit words. Maybe with a few additional fields as well, such as lineid. Woo. Limits are gone, cool. How would it help?

For the record, before UDMF there were such attempts. I strongly opposed all of them for the sole reason of the mess involved in them. Some cooked up some horrendously convoluted schemes of metadata and optional fields that just made my head hurt.

Fortunately, in the end all of these attempts died quickly without ever being heard from again. The sad thing is that without a certain person (Deep, a.k.a. randomlag) we would have had something like UDMF much earlier, but he persistently sabotaged the discussion so that it went nowhere - the probable motivation being that it would have meant a lot of work for him on DeepSea...

Back to the text vs. binary approach:

@milasudril: You clearly have no idea what you are talking about. If you design a new format for *anything* that might need future expansion, you simply cannot afford to think in such minimalistic terms. It is of paramount importance that such a format is open-ended should the need for something new arise. There has to be some means to add this new stuff.

Any binary format is by definition out of the picture here because by its very nature it's not expandable.
CSV is also useless because there's no direct association between a property and its value. This is clearly saving space at the wrong place.

And in the end: What does it matter? Yes, UDMF maps can easily become 20 MB of raw text, but we are talking about monstrously huge and complex maps here that you shouldn't even bother starting on a machine with less than 1 GB of RAM. The memory footprint may sound enormous, but let's not forget that maps are normally loaded at a time when no other resources are in memory. So right after the map data gets deleted the engine starts to load textures and other stuff, which normally require 3-4 times as much. As an example, KDiZD's Z1M10 is approx. 10 MB of UDMF map size, but fully initialized with all textures you need 50 MB of RAM to load all the data that's needed to play this map. It wouldn't even work on systems that have problems loading the 10 MB textmap lump.

As for distribution of such maps, any port which supports UDMF also supports loading Zips/PK3s as WAD replacements, so the only real issue in the end is compressed file size - and there UDMF is 1.3 - 1.5 times the size of the binary format. To me that's a non-issue compared to the gain in flexibility.

Post by milasudril »

Gez wrote:How would it help? What if a port introduces specials with 12 parameters, instead of just 5?
That is why all structs should include classID and size. For example, the first version introduces these fields:

Code:

int foo;
int bar;
Now the size of this structure is 8 bytes. If a new version wants more information, the size field is simply increased. Perhaps it now looks like this:

Code:

int foo;
int bar;
int baz;
and so on. BTW, this technique is used frequently in the Windows API.
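
A minimal C sketch of the size-field versioning described above, under stated assumptions: the struct names `ThingV1` and `ThingV2` and the function `load_thing` are hypothetical, the size field is stored inside the struct (so the two versions are 12 and 16 bytes here rather than 8 and 12), and fields missing from an old record default to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Version 1 of a hypothetical record. */
typedef struct {
    uint32_t size; /* size of the record as written, in bytes */
    int32_t foo;
    int32_t bar;
} ThingV1;

/* Version 2 appends one field; the existing fields keep their offsets. */
typedef struct {
    uint32_t size;
    int32_t foo;
    int32_t bar;
    int32_t baz;
} ThingV2;

/* Load a record written by any version into the newest layout.
 * Fields an older writer did not store stay at 0; bytes appended by a
 * future version are ignored. out->size keeps the size as stored.
 * Returns 0 on success, -1 if the record is too small or truncated. */
int load_thing(const unsigned char *data, size_t len, ThingV2 *out)
{
    uint32_t size;

    assert(data != NULL && out != NULL);
    if (len < sizeof size)
        return -1;
    memcpy(&size, data, sizeof size);
    if (size < sizeof(ThingV1) || size > len)
        return -1;

    memset(out, 0, sizeof *out); /* defaults for missing fields */
    memcpy(out, data, size < sizeof *out ? size : sizeof *out);
    return 0;
}
```

This only works as long as new fields are appended at the end and old ones are never removed or reordered. It also gives every reader the same fixed field order, which is the constraint the replies in this thread object to.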

Post by milasudril »

Graf Zahl wrote: Any binary format is by definition out of the picture here because by its very nature it's not expandable.
CSV is also useless because there's no direct association between a property and its value. This is clearly saving space at the wrong place.
Use a header that tells what kind of data comes and in what order.

Post by Graf Zahl »

Thank god I don't have to work with you. You absolutely don't get it, do you? I wouldn't want to work with such a messy format - ever!

Now take one guess why formats like XML, JSON or any other comparable format do not do such nonsense as you suggest.

Just a reminder: We no longer live in an age where file size is the most important factor when defining a data format.

Post by Gez »

milasudril wrote:
Gez wrote:How would it help? What if a port introduces specials with 12 parameters, instead of just 5?
That is why all structs should include classID and size. For example, the first version introduces these fields:

Code:

int foo;
int bar;
Now the size of this structure is 8 bytes. If a new version wants more information, the size field is simply increased. Perhaps it now looks like this:

Code:

int foo;
int bar;
int baz;
and so on. BTW, this technique is used frequently in the Windows API.
So, basically, each field needs to be introduced by a text field that tells its name and size. Great! You just introduced the multiple redundancy you wanted to avoid.
milasudril wrote:Use a header that tells what kind of data comes and in what order.
It wouldn't help at all for additional, optional parameters. You add a field to something, and you have to entirely reformat all your CSV files before they can use this new field...

Anyway, the ship has sailed. The UDMF specs are defined and approved. There's a level editor and several ports that support it, plus various miscellaneous tools, and more coming. If you don't like it, tough luck for you.

Post by milasudril »

Graf Zahl wrote:Just a reminder: We no longer live in an age where file size is the most important factor when defining a data format.
Just a reminder: everything that can be optimized should be optimized... The storage format may have a significant impact on load time. Decompression, dynamic string allocation, parsing... ugh.

Post by milasudril »

Gez wrote: So, basically, each field needs to be introduced by a text field that tells its name and size. Great! You just introduced the multiple redundancy you wanted to avoid.
No, the description is NOT stored in the file. It is probably found in some .h file. For an example of a file format using this technique, look at the bitmap file header. Version 5 is such an extension of version 4, which in turn is such an extension of version 3. The OS/2 version differs, yes.
Gez wrote:It wouldn't help any for additional, optional parameters. You add a field to something, and you have to reformat entirely all your CSV files before they can use this new field...
No, the old files use the old header and the newer files use the new header. The header solves the problem. A mathematical analogy:

0=(x-a)(x-b)=...

If I expand these parentheses, I clearly waste time solving the equation.
Gez wrote:Anyway, the ship has sailed. The UDMF specs are defined and approved. There's a level editor and several ports that support it, plus various miscellaneous tools, and more coming. If you don't like it, tough luck for you.
If there were more than one way to represent the same kind of data, I would use file format plug-ins. And perhaps I will sink that ship...

Post by Graf Zahl »

Sorry but LOL!

We should be long past the times where everything should be optimized for performance. I'd rather optimize for maximum usability than use some bastard format that's neither binary nor flexible.

As for load times: Insignificant. I did some tests with a UDMF'd version of KDiZD's Z1M10, which to this date is the largest map available, and the load time for the textmap is a) far less than a second even if uncompressed and b) insignificant in relation to all the things that need to be done to start the map (meaning: setting up internal data structures, spawning actors, loading textures, caching sounds and whatever else is needed).


Gez wrote:Anyway, the ship has sailed. The UDMF specs are defined and approved. There's a level editor and several ports that support it, plus various miscellaneous tools, and more coming. If you don't like it, tough luck for you.
And that of course. Just a reminder: The people who designed this format were the ones who have been/are going to be the ones working most closely with the new format and all its implications, so you should at least assume that they knew what they were doing. Where was your input when it happened? (Not that it'd matter because you'd probably have been laughed at or ridiculed for your ideas.)