2.2-:develop-15c5a27-20141116 suddenly keeps shutting down database

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

2.2-:develop-15c5a27-20141116 suddenly keeps shutting down database

Brad Clements
Hi,

We've been running eXist 2.2-:develop-15c5a27-20141116 for 2 couple of
years on Ubuntu 14.04 on Java 1.7.0_121 64Bit without problems.

This weekend I edited conf.xml and adjusted the consistency-check
settings to change when full and incrementals are done.

I restarted eXist and went home thinking all was well. but now I find
this morning that it's not actually working at all.

It seems to be putting the database into readonly mode but I cannot find
out why.

after restart, I try editing a file via webdav and see this:

2017-04-03 08:01:43,878 [eXistThread-35] FATAL (BrokerPool.java
[isReadOnly]:1334) - Partition containing DATA_DIR:
/opt/eXist-2.2/webapp/WEB-INF/data/dbx_dir.lck is running out of disk
space. Switching eXist-db to read only to prevent data loss!
2017-04-03 08:01:43,879 [eXistThread-35] INFO (BrokerPool.java
[setReadOnly]:1343) - Switching to read-only mode!!!
2017-04-03 08:01:43,879 [eXistThread-35] ERROR (ExistCollection.java
[createFile]:447) - org.exist.security.PermissionDeniedException:
Database is read-only
2017-04-03 08:01:43,880 [eXistThread-35] WARN (TransactionManager.java
[execute]:216) - Failed to write abort record to journal: Database is
shut down.
2017-04-03 08:01:43,881 [eXistThread-35] WARN (Journal.java
[flushToLog]:290) - Failed to trigger checkpoint!

But we have plenty of space

/dev/mapper/exist1--vg-opt          32G   13G   18G  43% /opt

and plentyof inodes

/dev/mapper/exist1--vg-opt        2097152 283992 1813160 14% /opt


I temporarily changed conf.xml minDiskSpace to -1

Now I get a different error

2017-04-03 08:15:31,246 [eXistThread-35] INFO  (HttpManager.java
[process]:213) - PROPFIND ::
http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml -
http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml
2017-04-03 08:15:31,248 [eXistThread-35] INFO (MiltonResource.java
[authorise]:350) - PROPFIND /db/orgs/sfiusa/test.xml (write=false)
2017-04-03 08:15:31,250 [eXistThread-35] INFO (MiltonResource.java
[authorise]:350) - PROPFIND /db/orgs/sfiusa/test.xml (write=false)
2017-04-03 08:15:31,252 [eXistThread-35] INFO (HttpManager.java
[process]:213) - UNLOCK ::
http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml -
http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml
2017-04-03 08:15:31,253 [eXistThread-35] INFO (MiltonResource.java
[authorise]:350) - UNLOCK /db/orgs/sfiusa/test.xml (write=true)
2017-04-03 08:15:31,253 [eXistThread-35] WARN (TransactionManager.java
[execute]:164) - Failed to create transaction. Error writing to log file.

I have rebooted the server and it's not coming up with fsck errors, so I
am pretty confused why it's failing to write to disk.

The java process runs as root, most files are owned by a regular user..
I don't recall a 'run as' userid setting anywhere.



It does seem to be actually writing to disk, given the dates the data
files were touched.

-rw-rw-r--  1 bkc  bkc  3771809792 Apr  3 08:17 dom.dbx
drwxrwxr-x  3 bkc  bkc       12288 Apr  3 08:17 range/
drwxrwxr-x  3 bkc  bkc        4096 Apr  3 08:17 lucene/
-rw-rw-r--  1 bkc  bkc    43859968 Apr  3 08:17 collections.dbx
-rw-rw-r--  1 bkc  bkc  4906831872 Apr  3 08:15 structure.dbx
-rw-rw-r--  1 bkc  bkc       36401 Apr  3 08:15 symbols.dbx
drwxrwxr-x  8 bkc  bkc        4096 Apr  3 08:15 ./
-rw-r--r--  1 root root        113 Apr  3 08:15 000000073f.log
drwxrwxr-x  2 bkc  bkc      172032 Mar 31 16:18 fs.journal/
drwxr-xr-x  3 bkc  bkc        4096 Jan  5 20:16 export/
-rw-r--r--  1 bkc  bkc           0 Jan 13  2016 counters
-rw-r--r--  1 bkc  bkc         140 Jan 30  2015 jmxservlet.token
drwxrwxr-x 15 bkc  bkc        4096 Jan 29  2015 expathrepo/
-rw-rw-r--  1 bkc  bkc       12288 Jan 29  2015 values.dbx
-rw-rw-r--  1 bkc  bkc      647168 Jan 29  2015 ngram.dbx
-rw-rw-r--  1 bkc  bkc        1085 Jan 29  2015 restxq.registry
drwxrwxr-x  3 bkc  bkc        4096 Jan 29  2015 fs/
-rw-rw-r--  1 bkc  bkc        8192 Jan 29  2015 sort.dbx
drwxrwxr-x  7 bkc  bkc        4096 Jan 29  2015 ../
-rw-rw-r--  1 bkc  bkc           0 Jul  1  2014 .DO_NOT_DELETE

Does anyone have any suggestions?

Thanks



--
Brad Clements,                [hidden email]    (315)268-1000
Jabber/XMPP: [hidden email]


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Exist-open mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/exist-open
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: 2.2-:develop-15c5a27-20141116 suddenly keeps shutting down database

Brad Clements
Please disregard this

Problem was caused by malformed conf.xml, having invalid hour '1900' instead of 19


See this error on conf.xml

        <job type="system" name="check1"
            class="org.exist.storage.ConsistencyCheckTask"
            cron-trigger="0 0 7,13,1900 * * ?">
            <parameter name="output" value="/backup/exist"/>
            <parameter name="backup" value="yes"/>
            <parameter name="incremental" value="yes"/>
            <parameter name="incremental-check" value="no"/>
        </job>


it's odd that exist runs, stays up, responds to http requests (so nagios is happy) but doesn't actually function. This problem  scrolls by pretty fast in the log file..  Then later errors report 'out of disk space' which is not the actual error.


2017-04-03 08:54:15,404 [WrapperListener_start_runner] INFO  (NativeBroker.java [sync]:3588) - Memory: 1,005,568K total; 3,728,384K max; 779,044K free
2017-04-03 08:54:15,404 [WrapperListener_start_runner] INFO  (BTree.java [printStatistics]:2549) - dom.dbx INDEX Buffers occupation : 86% (55 out of 64) Cache efficiency : 98%
2017-04-03 08:54:15,404 [WrapperListener_start_runner] INFO  (DOMFile.java [printStatistics]:1360) - dom.dbx DATA Buffers occupation : 10% (25 out of 256) Cache efficiency : 99%
2017-04-03 08:54:15,404 [WrapperListener_start_runner] INFO  (BTree.java [printStatistics]:2549) - collections.dbx INDEX Buffers occupation : 73% (47 out of 64) Cache efficiency : 81%
2017-04-03 08:54:15,404 [WrapperListener_start_runner] INFO  (BFile.java [printStatistics]:436) - collections.dbx DATA Buffers occupation : 17% (11 out of 64) Cache efficiency : 84%
2017-04-03 08:54:15,405 [WrapperListener_start_runner] INFO  (BTree.java [printStatistics]:2549) - values.dbx INDEX Buffers occupation : 0% (0 out of 64) Cache efficiency : N/A
2017-04-03 08:54:15,405 [WrapperListener_start_runner] INFO  (BFile.java [printStatistics]:436) - values.dbx DATA Buffers occupation : 0% (0 out of 64) Cache efficiency : N/A
2017-04-03 08:54:15,431 [WrapperListener_start_runner] INFO  (Journal.java [shutdown]:484) - Transaction journal cleanly shutting down with checkpoint...
2017-04-03 08:54:15,431 [WrapperListener_start_runner] INFO  (FileLock.java [release]:189) - Deleting lock file: /opt/eXist-2.2/webapp/WEB-INF/data/journal.lck
2017-04-03 08:54:15,436 [WrapperListener_start_runner] INFO  (FileLock.java [release]:189) - Deleting lock file: /opt/eXist-2.2/webapp/WEB-INF/data/dbx_dir.lck
2017-04-03 08:54:15,437 [WrapperListener_start_runner] ERROR (BrokerPool.java [configure]:267) - Unable to initialize database instance 'exist': CronExpression '0 0 7,13,1900 * * ?' is invalid,.
org.exist.EXistException: CronExpression '0 0 7,13,1900 * * ?' is invalid,.
        at org.exist.storage.BrokerPool.initialize(BrokerPool.java:1059)
        at org.exist.storage.BrokerPool.<init>(BrokerPool.java:715)
        at org.exist.storage.BrokerPool.configure(BrokerPool.java:249)
        at org.exist.storage.BrokerPool.configure(BrokerPool.java:223)
        at org.exist.jetty.JettyStart.run(JettyStart.java:166)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.exist.wrapper.Main.start(Main.java:95)
        at org.tanukisoftware.wrapper.WrapperManager$11.run(WrapperManager.java:4146)
Caused by: java.lang.RuntimeException: CronExpression '0 0 7,13,1900 * * ?' is invalid,.
        at org.quartz.CronScheduleBuilder.cronSchedule(CronScheduleBuilder.java:105)
        at org.exist.scheduler.impl.QuartzSchedulerImpl.createCronJob(QuartzSchedulerImpl.java:331)
        at org.exist.scheduler.impl.QuartzSchedulerImpl.createCronJob(QuartzSchedulerImpl.java:306)
        at org.exist.scheduler.impl.QuartzSchedulerImpl.setupConfiguredJobs(QuartzSchedulerImpl.java:535)
        at org.exist.scheduler.impl.QuartzSchedulerImpl.run(QuartzSchedulerImpl.java:147)
        at org.exist.storage.BrokerPool.initialize(BrokerPool.java:1045)
        ... 10 more
Caused by: java.text.ParseException: Hour values must be between 0 and 23
        at org.quartz.CronExpression.addToSet(CronExpression.java:948)
        at org.quartz.CronExpression.checkNext(CronExpression.java:698)
        at org.quartz.CronExpression.storeExpressionVals(CronExpression.java:681)
        at org.quartz.CronExpression.buildExpression(CronExpression.java:463)
        at org.quartz.CronExpression.<init>(CronExpression.java:276)
        at org.quartz.CronScheduleBuilder.cronSchedule(CronScheduleBuilder.java:102)
        ... 15 more
2017-04-03 08:54:15,998 [WrapperListener_start_runner] INFO  (JettyStart.java [lifeCycleStarting]:413) - Jetty server starting...
2017-04-03 08:54:17,879 [WrapperListener_start_runner] INFO  (AbstractExistHttpServlet.java [getOrCreateBrokerPool]:96) - Database already started. Skipping configuration ...
2017-04-03 08:54:17,879 [WrapperListener_start_runner] INFO  (AbstractExistHttpServlet.java [doGeneralExistServletConfig]:198) - Using default user guest for all unauthorized requests.
2017-04-03 08:54:17,894 [WrapperListener_start_runner] INFO  (XmlLibraryChecker.java [check]:149) - Looking for a valid Parser...
Checking for Xerces, found version Xerces-J 2.11.0
OK!
 
2017-04-03 08:54:17,896 [WrapperListener_start_runner] INFO  (XmlLibraryChecker.java [check]:184) - Looking for a valid Resolver...
Checking for Resolver, found version XmlResolver 1.2
OK!
 
2017-04-03 08:54:17,898 [WrapperListener_start_runner] INFO  (XmlLibraryChecker.java [check]:191) - Using parser org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser
2017-04-03 08:54:17,898 [WrapperListener_start_runner] INFO  (XmlLibraryChecker.java [check]:192) - Using transformer net.sf.saxon.IdentityTransformer
2017-04-03 08:54:18,571 [WrapperListener_start_runner] INFO  (JettyStart.java [lifeCycleStarted]:421) - Jetty server started.
2017-04-03 08:54:18,572 [WrapperListener_start_runner] INFO  (JettyStart.java [run]:233) - -----------------------------------------------------
2017-04-03 08:54:18,572 [WrapperListener_start_runner] INFO  (JettyStart.java [run]:234) - Server has started on ports 8680 8443. Configured contexts:
2017-04-03 08:54:18,572 [WrapperListener_start_runner] INFO  (JettyStart.java [run]:243) - '/exist'
2017-04-03 08:54:18,573 [WrapperListener_start_runner] INFO  (JettyStart.java [run]:281) - -----------------------------------------------------
2017-04-03 08:54:20,262 [eXistThread-36] INFO  (Descriptor.java [<init>]:102) - Reading Descriptor from file ../../../descriptor.xml
2017-04-03 08:54:20,581 [eXistThread-36] INFO  (RpcConnection.java [doQuery]:335) - query took 185ms.
2017-04-03 08:54:20,615 [eXistThread-34] DEBUG (RpcConnection.java [releaseQueryResult]:3046) - removed query result with handle 0
2017-04-03 08:54:40,165 [eXistThread-33] FATAL (BrokerPool.java [isReadOnly]:1334) - Partition containing DATA_DIR: /opt/eXist-2.2/webapp/WEB-INF/data/dbx_dir.lck is running out of disk space. Switching eXist-db to read only to prevent data loss!
2017-04-03 08:54:40,166 [eXistThread-33] INFO  (BrokerPool.java [setReadOnly]:1343) - Switching to read-only mode!!!
2017-04-03 08:54:40,166 [eXistThread-33] INFO  (NativeBroker.java [<init>]:268) - Database runs in read-only mode
2017-04-03 08:55:05,455 [eXistThread-36] FATAL (BrokerPool.java [isReadOnly]:1334) - Partition containing DATA_DIR: /opt/eXist-2.2/webapp/WEB-INF/data/dbx_dir.lck is running out of disk space. Switching eXist-db to read only to prevent data loss!
2017-04-03 08:55:05,455 [eXistThread-36] INFO  (BrokerPool.java [setReadOnly]:1343) - Switching to read-only mode!!!
2017-04-03 08:55:05,455 [eXistThread-36] INFO  (NativeBroker.java [<init>]:268) - Database runs in read-only mode
2017-04-03 08:55:05,520 [eXistThread-34] FATAL (BrokerPool.java [isReadOnly]:1334) - Partition containing DATA_DIR: /opt/eXist-2.2/webapp/WEB-INF/data/dbx_dir.lck is running out of disk space. Switching eXist-db to read only to prevent data loss!
2017-04-03 08:55:05,521 [eXistThread-34] INFO  (BrokerPool.java [setReadOnly]:1343) - Switching to read-only mode!!!
2017-04-03 08:55:05,521 [eXistThread-34] INFO  (NativeBroker.java [<init>]:268) - Database runs in read-only mode
2017-04-03 08:55:24,099 [eXistThread-34] INFO  (RpcConnection.java [doQuery]:335) - query took 3284ms.



On 04/03/2017 08:27 AM, Brad Clements wrote:
Hi,

We've been running eXist 2.2-:develop-15c5a27-20141116 for 2 couple of years on Ubuntu 14.04 on Java 1.7.0_121 64Bit without problems.

This weekend I edited conf.xml and adjusted the consistency-check settings to change when full and incrementals are done.

I restarted eXist and went home thinking all was well. but now I find this morning that it's not actually working at all.

It seems to be putting the database into readonly mode but I cannot find out why.

after restart, I try editing a file via webdav and see this:

2017-04-03 08:01:43,878 [eXistThread-35] FATAL (BrokerPool.java [isReadOnly]:1334) - Partition containing DATA_DIR: /opt/eXist-2.2/webapp/WEB-INF/data/dbx_dir.lck is running out of disk space. Switching eXist-db to read only to prevent data loss!
2017-04-03 08:01:43,879 [eXistThread-35] INFO (BrokerPool.java [setReadOnly]:1343) - Switching to read-only mode!!!
2017-04-03 08:01:43,879 [eXistThread-35] ERROR (ExistCollection.java [createFile]:447) - org.exist.security.PermissionDeniedException: Database is read-only
2017-04-03 08:01:43,880 [eXistThread-35] WARN (TransactionManager.java [execute]:216) - Failed to write abort record to journal: Database is shut down.
2017-04-03 08:01:43,881 [eXistThread-35] WARN (Journal.java [flushToLog]:290) - Failed to trigger checkpoint!

But we have plenty of space

/dev/mapper/exist1--vg-opt          32G   13G   18G  43% /opt

and plentyof inodes

/dev/mapper/exist1--vg-opt        2097152 283992 1813160 14% /opt


I temporarily changed conf.xml minDiskSpace to -1

Now I get a different error

2017-04-03 08:15:31,246 [eXistThread-35] INFO  (HttpManager.java [process]:213) - PROPFIND :: http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml - http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml
2017-04-03 08:15:31,248 [eXistThread-35] INFO (MiltonResource.java [authorise]:350) - PROPFIND /db/orgs/sfiusa/test.xml (write=false)
2017-04-03 08:15:31,250 [eXistThread-35] INFO (MiltonResource.java [authorise]:350) - PROPFIND /db/orgs/sfiusa/test.xml (write=false)
2017-04-03 08:15:31,252 [eXistThread-35] INFO (HttpManager.java [process]:213) - UNLOCK :: http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml - http://localhost:8680/exist/webdav/db/orgs/sfiusa/test.xml
2017-04-03 08:15:31,253 [eXistThread-35] INFO (MiltonResource.java [authorise]:350) - UNLOCK /db/orgs/sfiusa/test.xml (write=true)
2017-04-03 08:15:31,253 [eXistThread-35] WARN (TransactionManager.java [execute]:164) - Failed to create transaction. Error writing to log file.

I have rebooted the server and it's not coming up with fsck errors, so I am pretty confused why it's failing to write to disk.

The java process runs as root, most files are owned by a regular user.. I don't recall a 'run as' userid setting anywhere.



It does seem to be actually writing to disk, given the dates the data files were touched.

-rw-rw-r--  1 bkc  bkc  3771809792 Apr  3 08:17 dom.dbx
drwxrwxr-x  3 bkc  bkc       12288 Apr  3 08:17 range/
drwxrwxr-x  3 bkc  bkc        4096 Apr  3 08:17 lucene/
-rw-rw-r--  1 bkc  bkc    43859968 Apr  3 08:17 collections.dbx
-rw-rw-r--  1 bkc  bkc  4906831872 Apr  3 08:15 structure.dbx
-rw-rw-r--  1 bkc  bkc       36401 Apr  3 08:15 symbols.dbx
drwxrwxr-x  8 bkc  bkc        4096 Apr  3 08:15 ./
-rw-r--r--  1 root root        113 Apr  3 08:15 000000073f.log
drwxrwxr-x  2 bkc  bkc      172032 Mar 31 16:18 fs.journal/
drwxr-xr-x  3 bkc  bkc        4096 Jan  5 20:16 export/
-rw-r--r--  1 bkc  bkc           0 Jan 13  2016 counters
-rw-r--r--  1 bkc  bkc         140 Jan 30  2015 jmxservlet.token
drwxrwxr-x 15 bkc  bkc        4096 Jan 29  2015 expathrepo/
-rw-rw-r--  1 bkc  bkc       12288 Jan 29  2015 values.dbx
-rw-rw-r--  1 bkc  bkc      647168 Jan 29  2015 ngram.dbx
-rw-rw-r--  1 bkc  bkc        1085 Jan 29  2015 restxq.registry
drwxrwxr-x  3 bkc  bkc        4096 Jan 29  2015 fs/
-rw-rw-r--  1 bkc  bkc        8192 Jan 29  2015 sort.dbx
drwxrwxr-x  7 bkc  bkc        4096 Jan 29  2015 ../
-rw-rw-r--  1 bkc  bkc           0 Jul  1  2014 .DO_NOT_DELETE

Does anyone have any suggestions?

Thanks





-- 
Brad Clements,                [hidden email]    (315)268-1000
Jabber/XMPP: [hidden email]

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Exist-open mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/exist-open
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: 2.2-:develop-15c5a27-20141116 suddenly keeps shutting down database

Dannes Wessels-3
In reply to this post by Brad Clements
Hi Brad,

On 3 Apr 2017, at 14:27 , Brad Clements <[hidden email]> wrote:

We've been running eXist 2.2-:develop-15c5a27-20141116 for 2 couple of 
years on Ubuntu 14.04 on Java 1.7.0_121 64Bit without problems.

This weekend I edited conf.xml and adjusted the consistency-check 
settings to change when full and incrementals are done.

I restarted eXist and went home thinking all was well. but now I find 
this morning that it's not actually working at all.

It seems to be putting the database into readonly mode but I cannot find 
out why.


But we have plenty of space


I remember that we have been using an unreliable API to probe the available diskspace. We changed this since then and the problems have gone. I thought only windows server were affected though. On http://exist-open.markmail.org I can’t find the relevant mail thread though….




2017-04-03 08:15:31,253 [eXistThread-35] WARN (TransactionManager.java 
[execute]:164) - Failed to create transaction. Error writing to log file.

I have rebooted the server and it's not coming up with fsck errors, so I 
am pretty confused why it's failing to write to disk.

The java process runs as root, most files are owned by a regular user.. 
I don't recall a 'run as' userid setting anywhere.


Only when running via the service wrapper it is possible to set a user-id.

I am not yet sure what to suggest to continue….

sorry

D.


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Exist-open mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/exist-open
Loading...