MapLog - Logging world-related things

Hi, I am Thomas, and I would like to announce the creation of MapLog.

What is MapLog?:

The primary goal of MapLog is to provide a plugin that can roll back or inspect blocks. The secondary goal is to perform rollbacks in such a way that modded blocks can handle them. So we can leave the past behind…

Storage:

As has been said in this thread, NBT or flat-file storage isn't really efficient. MapLog will be an SQL-only plugin, but I am thinking of writing a small script that will set up the database for you.

API?:

Oh, this part is funny. The way I will write this plugin, it should be able to work on multiple APIs. Examples of these APIs are Sponge, Granite, Rainbow (cough) and Forge. So the plugin will release two jars: a multi-plugin and a Forge mod.
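To give a rough idea of the split (every name below is made up for this post, nothing here is final): a platform-neutral core holds all the logging logic, and each API jar only contains a thin adapter that translates its own events into calls on that core.

```java
/** Hypothetical sketch: what the core needs to know about a block change,
 *  regardless of which platform reported it. */
interface BlockChange {
    String world();
    int x(); int y(); int z();
    String player();
    long timestamp();
}

/** The shared core; Sponge/Granite/Rainbow/Forge adapters all call into this. */
final class MapLogCore {
    void record(BlockChange change) {
        // queue the change for the database writer (more on that later)
    }
}

/** Example adapter: each platform jar only translates its own events. */
final class ForgeAdapter /* would extend a Forge event handler in the real mod */ {
    private final MapLogCore core = new MapLogCore();
    // onBlockPlace(event) { core.record(...translate the event...); }
}
```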

GitHub :octocat:, CI?:

Their isn’t a CI setup yet. As the current code doesn’t do anything. But the github is here (currently working in the temp branch). If people want to help me out with it, send me a PM :smile:.

Release date?:

While Rainbow and Forge already have a server release, I am not really in a hurry to finish this. I know I can code well, but I am really unsure of myself (that's just how I am). So at this point I am reading a lot of books to gain some more knowledge. I don't want to end up with a plugin that I can't update, or that breaks the database with every update. So I can't announce a release date.


Generally this seems like a good idea for a plugin. Hopefully we will get a similar plugin, or the author of ProtocolLib will decide to port it over; in that case the packet faking shouldn't be too hard on your part. I think the NBT storage method would be cool if supported, as server owners wouldn't even have to deal with external backup files. If this is feasible without putting too much load on the Forge server, it would be a great plugin.

So long as the two-month part is configurable, or purging can be disabled entirely. Some servers aren't concerned with the size of their databases and may need to check logs much older than two months.

It may be a better idea to have a scheduler of some sort that runs at quite long intervals, like a day or half a day, and also scans on server start. On large servers, checking on events can be just as hefty as polling constantly. I'd say just scanning on start would be fine, but I can't guarantee that all servers restart daily or on any schedule, hence the scheduler. I assume it won't be necessary to check more frequently than that. At any given point, you'd only be telling the database to remove a day or so of activity at a time (depending on the interval of the scheduled task).
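Roughly something like this, using a plain Java scheduler (the names, intervals, and table/column names are just placeholders):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class PurgeScheduler {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Runs the purge once at startup (initial delay 0), then every 12 hours.
     *  Both the interval and the retention should come from the config. */
    void start(long retentionDays) {
        scheduler.scheduleAtFixedRate(
                () -> purgeOlderThan(retentionDays), 0, 12, TimeUnit.HOURS);
    }

    private void purgeOlderThan(long days) {
        // e.g. DELETE FROM maplog_blocks WHERE time < NOW() - INTERVAL ? DAY
        // (table and column names are made up for this example)
    }

    void stop() { scheduler.shutdown(); }
}
```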

Of course this will be configurable; I am just thinking about what the default settings should be, so "small" servers can run this too.

Regarding the scheduler, I am worried because of the NBT tags I want to use. In order to read a tag, I guess you have to load the chunk itself. So imagine scanning the complete map for tags; that would eat up some resources. NBT has its advantages, but the biggest disadvantage would be that the data gets spread over the map.
Also, let's say some player placed a block a year ago, and nothing else has changed since then. Wouldn't it be awesome if the server operator could still find out who placed the block xD (this would only work if nobody changed anything in that chunk, though)?

I can see problems with this: it would cause the world files to balloon to a very large size after a while if you don't keep the log data separate from the world files. As far as I'm aware, the world files can't be reduced in size when you clean up data in them without recreating the whole region file.

MySQL storage would probably be the most practical option, and one of the fastest and most widely supported storage types you could support here; something like SQLite would be a bit slow for a task like this.

Of the logging plugins I've used so far, I tend to like LogBlock's database structure the best. Having a separate table per world makes the tables a bit more manageable than lumping all of the log data into just one table. (I wish Dynmap did the same for their database; having a 14 GB table is just ridiculous when you try to run an optimize on it.)
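Roughly what I mean, with a completely made-up schema (LogBlock's real columns will differ):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class WorldTables {
    /** Creates one log table per world. The schema is illustrative only. */
    static void createTableFor(Connection conn, String worldName) throws SQLException {
        // Whitelist the world name before splicing it into the DDL;
        // identifiers can't be bound as ? parameters.
        if (!worldName.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unsafe world name: " + worldName);
        }
        String sql = "CREATE TABLE IF NOT EXISTS maplog_" + worldName + " ("
                + "  id BIGINT AUTO_INCREMENT PRIMARY KEY,"
                + "  time TIMESTAMP NOT NULL,"
                + "  player VARCHAR(36) NOT NULL,"
                + "  x INT NOT NULL, y INT NOT NULL, z INT NOT NULL,"
                + "  old_block VARCHAR(64), new_block VARCHAR(64),"
                + "  INDEX idx_pos (x, z, y), INDEX idx_time (time))";
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(sql);
        }
    }
}
```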

No, this can be cleaned up really easily. But yeah, I just realised that map backups would also increase in size. In a way that's good, because you can be sure the rollback data is correct.

Not everyone knows how to set up a MySQL server. I know some server owners, and most of them are fucking scared of it.

Good idea. I wasn't really thinking about different worlds yet, but a simple WorldInfo class can fix my current code :smile:.
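Something like this is the rough idea (nothing final, just a sketch):

```java
/** Hypothetical sketch of the WorldInfo idea: one small object per world
 *  that the logger consults instead of assuming a single world. */
final class WorldInfo {
    private final String name;      // world name, also used for the table name
    private final boolean logged;   // whether MapLog records this world at all

    WorldInfo(String name, boolean logged) {
        this.name = name;
        this.logged = logged;
    }

    String tableName() { return "maplog_" + name; }
    boolean isLogged() { return logged; }
}
```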

If you use flat-file databases, I wonder if it would even be remotely plausible to separate the files by chunk/region as well as by world. I'm no expert on how file size affects query performance, but it may be worth doing.
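Rough sketch of what I mean, borrowing the r.&lt;x&gt;.&lt;z&gt; naming that the vanilla region files use for their 32×32-chunk groups (the folder layout itself is just a guess):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ChunkLogFiles {
    /** Maps a chunk to one log file per 32x32-chunk region, mirroring
     *  the vanilla region naming. The directory layout is hypothetical. */
    static Path logFileFor(String world, int chunkX, int chunkZ) {
        int regionX = chunkX >> 5;  // 32 chunks per region axis
        int regionZ = chunkZ >> 5;  // arithmetic shift handles negatives
        return Paths.get("plugins", "MapLog", world,
                "r." + regionX + "." + regionZ + ".log");
    }
}
```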

I suppose it wouldn't be ideal to do with NBT-stored data, and I'm not sure I'd like logs to be stored in the world files either. I'm sure most people would prefer them saved elsewhere. I'd even prefer it if you could mirror the chunk data to a folder in the plugin where it can be stored in NBT format, but not in the actual world files, so that for every file in the world data there's a matching log file in the plugin data. I certainly think running the database-cleanup queries on a scheduler and on server start would be much nicer, though; otherwise it might exceed any per-hour query limit set by the SQL server.

On that last note, it may also be worth queueing SQL queries to be executed in timed tasks when logging events, so the plugin doesn't send a separate query for every event. Also, dump the queued SQL queries to disk as they accumulate, so that if the server shuts down they can still be executed when it restarts.
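Something along these lines, maybe (all of the names here are made up):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

final class QueryQueue {
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    private final Path spillFile = Paths.get("plugins", "MapLog", "pending.sql");

    void enqueue(String sql) { pending.add(sql); }

    /** Called from a timed task: hand everything queued so far to the db writer. */
    List<String> drain() {
        List<String> batch = new ArrayList<>();
        String sql;
        while ((sql = pending.poll()) != null) batch.add(sql);
        return batch;
    }

    /** On shutdown, write whatever is still queued to disk; replay it on next start. */
    void dumpToDisk() throws IOException {
        Files.createDirectories(spillFile.getParent());
        Files.write(spillFile, drain());
    }
}
```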

Yeah, I think I'll skip the NBT tag storage; it's simply not scalable and will probably cause problems. Talking about SQL, it's honestly something I am really bad at. Is it possible to use Java Lists with an SQL database?

Also, just thinking about that: we can't store a 16 GiB log file in RAM :open_mouth:. Dang it, I need to rethink this. I thought using a reversible pattern would be good, but it's just too much data.

I'm pretty terrible with Java & SQL myself XD I'm still partial to flat files resembling the world data files, i.e. storing the data for a chunk in its own file in your plugin data folder. That way you can fetch, append, read, and modify the data quickly enough without even really needing to load the chunk, just by knowing its location/file name.

One thing LogBlock didn't do is coalesce the records into one batch query when adding them to the database; it sent a separate query for every entry. Batching would make adding records a bit faster for "non-local" databases (ones not on the same machine as the server) where latency is a factor.
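For what it's worth, plain JDBC already supports this via addBatch/executeBatch; the table and columns below are made up:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.List;

final class BatchWriter {
    /** One round trip for many records instead of one round trip per record. */
    static void insertAll(Connection conn, List<LogEntry> entries) throws SQLException {
        String sql = "INSERT INTO maplog_world (time, player, x, y, z, old_block, new_block)"
                + " VALUES (?, ?, ?, ?, ?, ?, ?)"; // illustrative schema
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (LogEntry e : entries) {
                ps.setTimestamp(1, new Timestamp(e.time));
                ps.setString(2, e.player);
                ps.setInt(3, e.x); ps.setInt(4, e.y); ps.setInt(5, e.z);
                ps.setString(6, e.oldBlock);
                ps.setString(7, e.newBlock);
                ps.addBatch();               // queue locally...
            }
            ps.executeBatch();               // ...then send everything at once
        }
    }

    static final class LogEntry {
        long time; String player; int x, y, z; String oldBlock, newBlock;
    }
}
```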

Well, let's give a heads-up on how far along I am with this.

First of all, I threw away almost all of my existing code, as I figured out that filtering data HAS to be done in SQL. I was thinking of something like this:

The plugin starts up and makes an SQL connection. We are using a connection pool that can be configured in how many connections it may hold. Also, recorded entries don't get flushed directly to the database; we wait a bit until we have, let's say, 500 records and then flush everything at once. This way we don't stress the database and the machine too hard. Also keep in mind I will keep the connection open; I benchmarked a bit with SQL, and opening and closing connections is too resource intensive. Of course, the 500-record cache also gets checked when you look things up. This way we limit the "database busy" messages (although you would still get the message if you look something up during a flush, a flush doesn't take long (~15 ms)).

(Note: for people whose hosting provider only allows them one SQL connection, I will add an optional close-on-flush feature.)
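A stripped-down sketch of the cache idea (records are just strings here to keep it short; the real thing would use a proper record class):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class RecordCache {
    private final int flushThreshold;          // e.g. 500, configurable
    private final List<String> cache = new ArrayList<>();

    RecordCache(int flushThreshold) { this.flushThreshold = flushThreshold; }

    synchronized void record(String entry) {
        cache.add(entry);
        if (cache.size() >= flushThreshold) flush();
    }

    /** Lookups must check the unflushed cache too, or recent edits go missing. */
    synchronized List<String> lookup(Predicate<String> filter) {
        List<String> hits = new ArrayList<>();
        for (String entry : cache) {
            if (filter.test(entry)) hits.add(entry);
        }
        // ...then append the matching rows from the database query here.
        return hits;
    }

    synchronized void flush() {
        // hand the batch to the database writer (one batched INSERT), then clear
        cache.clear();
    }
}
```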

It's true that a persistent connection is far more efficient than a non-persistent one.

For the MySQL connections, LB uses one persistent connection for everything, and most other plugins use a single connection as well. I'm not sure having multiple connections would give that much of a boost in database performance when adding or querying records. I guess it depends on what you want to do with the connections: a separate connection per world, which "may" be a little more efficient, or a single connection, which would probably be easier to keep track of and use in a plugin. The benefit of multiple connections probably won't be all that apparent until you're running the plugin on a very large server with hundreds of players on it.

For the number of records to batch at a time, you should make it flush either after a certain number of records has been reached, or after a certain amount of time has passed since the previous batch; both should be configurable.
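So something as simple as a second, time-based trigger alongside the size check (the names here are placeholders):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class FlushTimer {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    /** The size trigger lives wherever records are added; this adds the time
     *  trigger so a quiet server still flushes regularly. */
    void start(Runnable flushAction, long maxIntervalSeconds) {
        timer.scheduleAtFixedRate(flushAction,
                maxIntervalSeconds, maxIntervalSeconds, TimeUnit.SECONDS);
    }

    void stop() { timer.shutdown(); }
}
```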

Also, one thing you should be aware of when using MySQL: if the server doesn't get any activity on one of its connections for long enough, it closes the connection due to inactivity. You would need to run a keep-alive sort of query every half hour to an hour for that not to happen. The default timeout for MySQL is set to eight hours, I believe, but a server might have it set lower. Not all of the plugins I've seen use a keep-alive query to keep the connection active, which can cause an error in the plugin when it queries something over a stale, timed-out connection.
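A keep-alive can be as dumb as a scheduled SELECT 1 (the interval is just an example):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class KeepAlive {
    /** Pings the connection every 30 minutes so the MySQL wait_timeout
     *  (8 hours by default, often set lower) never kills it while idle. */
    static void install(Connection conn) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            try (Statement st = conn.createStatement()) {
                st.execute("SELECT 1");   // cheap no-op query
            } catch (SQLException e) {
                // stale connection: reconnect here instead of failing later
            }
        }, 30, 30, TimeUnit.MINUTES);
    }
}
```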

Updated the top post!

Is this still even relevant?